REQUEST FOR FILING A PATENT APPLICATION UNDER 37 C.F.R. § 1.53(b)

Attorney Docket: 147/37315CP 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Box PATENT APPLICATION

Assistant Commissioner for Patents

Washington, D.C. 20231

Sir:

This is a request for filing a divisional application under 
37 C.F.R. 1.53(b), of pending prior application: 

Serial No. 08/922,635



Filed on 09/03/97 



Of John E. PILETZ et al.



Entitled: DNA MOLECULES ENCODING IMIDAZOLINE 
RECEPTIVE POLYPEPTIDES AND POLYPEPTIDES ENCODED THEREBY 
Examiner: E. Lazar-Wesley

Group: 1646



Batch No: W07 



Accompanying this order is a true copy of the
prior application as originally filed which
includes a specification of 99 pages, 36 claims
and 6 sheets of drawings of Figures 1-6.

New Formal drawings are being filed herewith
consisting of 2 sheet(s), depicting Figures
6A-6B.

Declaration and Power of Attorney: 

a. Newly executed (original or copy) 

b. X Copy from prior application. Also 
attached is a copy of the Substitute Power of 
Attorney which was filed on July 22, 1999 in 
prior application Serial No. 08/922,635. 

i. Deletion of Inventors - signed
statement attached deleting
inventor(s) named in the prior
application

Incorporation by Reference: 

The entire disclosure of the prior application,
from which a copy of the oath or declaration is
supplied under Box 3b, is considered as being
part of the disclosure of the accompanying
application and is hereby incorporated by
reference therein.



X 5. Cancel original claims 1-15 and 25-26.
6. Small entity status: 

a. A small entity statement is enclosed 

b. A statement of small entity status 

(copy attached) was filed on 

in the prior application and 

status as a small entity is still 
proper and desired. 

c. Is no longer claimed 

X 7. The filing fee is calculated below: 

CLAIMS AS FILED, INCLUDING ANY CLAIMS
CANCELLED OR ADDED BY PRELIMINARY AMENDMENT

Basic Fee                                                $ 760.00
Total Claims    23 - 20 = 3    x 9   = $        x 18  = $  54.00
Ind. Claims      5 -  3 = 2    x 39  = $        x 78  = $  78.00
Multiple Dependent Claims      + 130 = $        + 260 = $
Total                                                    $ 892.00

8. Please charge my Deposit Account No. 05-1323 

(Docket # ) in the amount of $ . 

X 9. A check in the amount of $892.00 to cover the
filing fee is enclosed.

X 10. The Commissioner is authorized to charge any fee
which may be required under 37 CFR 1.16 or 37 CFR
1.17 or credit any overpayment to Deposit Account
No. 05-1323 (Docket 147/37315D2).

X 11. Amend the specification by inserting before the
first line the sentence: --This application is a
division of application Serial No. 08/922,635,
filed September 3, 1997, which is a continuation-
in-part of application Serial No. 08/650,766,
filed May 20, 1996, which is related to
provisional application Serial No. 60/12,600,
filed March 1, 1996, abandoned.--

12. Priority of Appln. No(s). ________, filed in ________
on ________, is hereby claimed under 35 U.S.C. 119.



13. A certified copy of each said priority document 
was filed in application Serial No. on 



14. The prior application is assigned of record to 
The University of Mississippi Medical Center. 

15. The Substitute power of attorney in the prior 
application is to: 

Herbert I. Cantor, Reg. No. 24,392; James F.
McKeown, Reg. No. 25,406; Donald D. Evenson, Reg.
No. 26,160; Joseph D. Evans, Reg. No. 26,269;
Gary R. Edwards, Reg. No. 31,824; Jeffrey D.
Sanok, Reg. No. 32,169; and James M. Verna, Reg.
No. 33,287

X a. The Substitute power appears in the prior
application.

b. Since the power does not appear in the 

original application papers, a copy of the 
power in the prior application is enclosed. 

c. Attached is a duplicate of a Supplemental 

Declaration which was filed in the prior 
application to overcome informalities.

X d. Address all future correspondence to: 

J. D. Evans 
EVENSON, McKEOWN, EDWARDS 
& LENAHAN, P.L.L.C. 
1200 G Street, N.W. 

Suite 700 
Washington, DC 20005 

16. Forms PTO-892 and PTO-1449 listing 
prior art made of record in the prior 
application are attached. A copy of 
each of the listed references should be 
available in the prior application 
file. 

17. A Preliminary Amendment is being filed herewith. 

18. Return Receipt Postcard. 

X 19. Attached is a letter requesting the use of the
computer readable form of the Corrected Sequence
Listing submitted on May 7, 1999 in pending prior
application Serial No. 08/650,766 in this
Division Application.

X 20. Other: Corrected Sequence Listing
Substitute Drawing of Figures 6A-6B



Respectfully submitted, 



October 8, 1999 




J.D. Evans 

Registration No. 26,269 



EVENSON, McKEOWN, EDWARDS 

& LENAHAN, P.L.L.C. 
1200 G Street, N.W., Suite 700 
Washington, D.C. 20005 
Tel: (202) 628-8800 
Fax: (202) 628-8844 

JDE : JMV : ate 

Attorney Docket: 147/37315D2
PATENT 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Applicant: JOHN E. PILETZ ET AL.

Serial No.: Divisional of 08/922,635

Prior Group Art Unit: 1646



Filed: OCTOBER 8, 1999 



Prior Examiner: E. Lazar-Wesley 



Title: 



DNA MOLECULES ENCODING IMIDAZOLINE RECEPTIVE 
POLYPEPTIDES AND POLYPEPTIDES ENCODED THEREBY 



PRELIMINARY AMENDMENT 



Assistant Commissioner for Patents 
Washington, D.C. 20231 



Sir: 



Please enter the following amendments to the claims prior 
to the examination of the continued prosecution application. 

IN THE CLAIMS:

Please amend Claims 16, 19, 22-24, 27, 32, 35 and 36 as
follows:

16. (Amended) An isolated polypeptide including a site
which is receptive to imidazoline compounds, said polypeptide
containing an amino acid sequence with at least [80] 90% sequence
[similarity] identity with the amino acid sequence shown in SEQ
ID [No.] NO: 6, wherein the percent identity is determined using
the BLASTP program with default parameters.

19. (Amended) An isolated polypeptide including a site
which is receptive to imidazoline compounds, said polypeptide
containing an amino acid sequence with at least [80] 90% sequence
[similarity] identity with the amino acid sequence shown in SEQ
ID [No.] NO: 5, wherein the percent identity is determined using
the BLASTP program with default parameters.

22. (Amended) A fragment of the amino acid sequence shown
in SEQ ID [No.] NO: 5 or 6, which fragment is receptive to
imidazoline compounds.

23. (Amended) A polypeptide [according to any one] of
claim[s] 16 [to 22], which is immunoreactive with at least one
of Reis antiserum and Dontenwill antiserum.

24. (Amended) A polypeptide [according to any one] of
claim[s] 16 [to 23], which is a human polypeptide.

27. (Amended) A method of screening for a ligand of an
imidazoline receptor, which method comprises:

culturing a host cell [as defined in claim 15] transfected
with a vector containing an isolated DNA molecule in a culture
medium capable of [to] expressing a polypeptide including an
amino acid sequence which is receptive to imidazoline compounds;

contacting said polypeptide with a labelled ligand for the 
imidazoline receptor under conditions effective to bind the 
labelled ligand thereto; 

contacting said polypeptide with a candidate ligand; and 

detecting any displacement of the labelled ligand from said 
polypeptide, wherein displacement signifies that the candidate 
ligand is a ligand for the imidazoline receptor. 

32. (Amended) A method of obtaining a DNA material 
encoding a polypeptide which is receptive to imidazoline 
compounds, said method comprising: 

providing a labelled DNA probe by labelling a DNA molecule 
identical or complementary to a DNA molecule comprising a DNA 
sequence with at least 90% sequence identity with the DNA 
sequence shown in SEQ ID NO: 1, 2, 3, 4 or 5, wherein the percent
identity is determined using the BLASTN program with default 
parameters [as defined in any one of claims 1 to 9] or a 
[restriction] fragment thereof; 

contacting said DNA probe with genetic material suspected 
of encoding said imidazoline receptive polypeptide; 

hybridizing said DNA probe and said genetic material under 
stringent hybridization conditions;

identifying any portion of the genetic material which 
hybridizes to said DNA probe; and 

isolating said identified material. 

35. (Amended) A method according to claim 32, wherein the
labelled DNA probe is provided by labelling a 1110 bp ApaI-EcoRI
restriction fragment from a DNA molecule comprising a DNA
sequence with at least 90% sequence identity with the DNA
sequence shown in SEQ ID NO: 1, 2, 3, 4 or 5, wherein the percent
identity is determined using the BLASTN program with default
parameters [according to claim 12 or 13].

36. (Amended) A method of raising antibodies 
immunoreactive with a polypeptide which is receptive to an 
imidazoline compound, which method comprises: 

injecting an animal with a polypeptide as defined in [any
one of] claim[s] 16 [to 24 and 26]; and

isolating antibodies produced by the animal. 

Kindly add new Claims 37-40 as follows: 

--37. A polypeptide of claim 19, which is
immunoreactive with at least one of Reis antiserum and Dontenwill
antiserum.

38. A polypeptide of claim 19, which is a human
polypeptide.

39. A method according to claim 32, wherein the labelled
DNA probe is provided by labelling a 1.85 kb ApaI-EcoRI
restriction fragment from a DNA molecule comprising a DNA
sequence with at least 90% sequence identity with the DNA
sequence shown in SEQ ID NO: 1, 2, 3, 4 or 5 and operably linked
with a promoter sequence, wherein the percent identity is
determined using the BLASTN program with default parameters.

40. A method of raising antibodies immunoreactive with a
polypeptide which is receptive to an imidazoline compound, which
method comprises:

injecting an animal with a polypeptide as defined in claim
19; and

isolating antibodies produced by the animal.--

REMARKS 

The above amendments have been made to reduce the number of 
claims prior to calculation of the filing fee and to eliminate 
the multiple claim dependencies. 

Favorable action on the application is earnestly solicited. 

If there are any questions regarding this Preliminary 
Amendment or this application in general, a telephone call to the 
undersigned would be appreciated since this should expedite the 
prosecution of the application for all concerned. 



It is respectfully requested that, if necessary to effect 
a timely response, this paper be considered as a Petition for an 
Extension of Time sufficient to effect a timely response and 
shortages in other fees, be charged, or any overpayment in fees 
be credited, to the Account of Evenson, McKeown, Edwards & 
Lenahan, P.L.L.C., Deposit Account No. 05-1323 (Docket
#147/37315D2).



October 8, 1999 



Respectfully submitted,

James M. Verna, Ph.D.
Registration No. 33,287

J. D. Evans
Registration No. 26,269



EVENSON, McKEOWN, EDWARDS 

& LENAHAN 
1200 G Street, N.W., Suite 700 
Washington, DC 20005 
Telephone No.: (202) 628-8800 
Facsimile No.: (202) 628-8844 

DNA MOLECULES ENCODING IMIDAZOLINE RECEPTIVE POLYPEPTIDES
AND POLYPEPTIDES ENCODED THEREBY 

REFERENCE TO RELATED APPLICATION 
The present application is a continuation-in-part of
application Serial No. 08/650,766 filed May 20, 1996, which is
related to provisional application Serial No. 60/12,600, filed
March 1, 1996.

BACKGROUND OF THE INVENTION 
1. Field of the Invention
The present invention is directed to DNA molecules encoding
imidazoline receptive polypeptides, preferably encoding human
imidazoline receptive polypeptides, that can be used as an
imidazoline receptor (abbreviated IR). In addition, transcript(s)
and protein sequences are predicted from the DNA clones. The
invention is also directed to a genomic DNA clone designated as
JEP-1A. The cDNA clones according to the invention comprise cDNA
homologous to portion(s) of this genomic clone, including 5A-1
cDNA, cloned by the inventors, that established the open-reading
frame for translation of mRNA from the gene, and established the
immunoreactive properties of its polypeptide sequence in an
expression system. Also, the invention relates to cDNA clone
EST04033, which is another clone identified to contain cDNA
sequences from the JEP-1A gene, and of which the 5A-1 is a part,
that encodes an active fragment of the IR polypeptide in
transfection assays, and the protein sequences thereof. The
invention also relates to methods for producing such genomic and
cDNA clones, methods for expressing the IR protein and fragments,
and uses thereof.
2. Description of Related Art

It is believed that brainstem imidazoline receptors possess 
binding site(s) for therapeutically relevant imidazoline 
compounds, such as clonidine and idazoxan. These drugs represent 
the first generation of ligands discovered for the binding 
site(s) of imidazoline receptors. However, clonidine and 
idazoxan were developed based on their high affinity for
α2-adrenergic receptors. Second generation ligands, such as
moxonidine, possess somewhat improved selectivity for IR over
α2-adrenergic receptors, but more selective compounds for IR are
needed.

An imidazoline receptor clone is of particular interest 
because of its potential utility in identifying novel 
pharmaceutical agents having greater potency and/or more
selectivity than currently available ligands have for imidazoline 
receptors. Recent technological advances permit pharmaceutical 
companies to use combinatorial chemistry techniques to rapidly 
screen a cloned receptor for ligands (drugs) binding thereto. 
Thus, a cloned imidazoline receptor would be of significant value 
to a drug discovery program. 

Until now, the molecular nature of imidazoline receptors
has remained unknown. For instance, no amino acid sequence data for a
novel IR, e.g., by N-terminal sequencing, has been reported.
Three different techniques have been described in the literature
by three different laboratories to visualize imidazoline- 
selective binding proteins (imidazoline receptor candidates) 
using gel electrophoresis. Some important consistencies have 
emerged from these results despite the diversity of the 
techniques employed. On the other hand, multiple protein bands 
have been identified, which suggests heterogeneity amongst 
imidazoline receptors. These reports are discussed below. 

Some of the abbreviations used hereinbelow have the
following meanings:

α2AR      Alpha-2 adrenoceptor
BAC       Bovine adrenal chromaffin
ECL       Enhanced chemiluminescence (protein detection procedure)
EST       Expressed Sequence Tag (a one-pass cDNA documentation
          without identification)
I-site    Any imidazoline-receptive binding site (e.g., encoded on IR)
IR1       Imidazoline receptor subtype 1
IR-Ab     Imidazoline receptor antibody
I2 site   Imidazoline binding subtype 2
kDa       Kilodaltons (molecular size)
MAO       Monoamine oxidase
MW        Molecular weight
NRL       European abbreviation for RVLM (see below)
PC-12     Phaeochromocytoma-12 cells
125PIC    [125I]p-iodoclonidine
PKC       Protein Kinase C
RVLM      Rostral Ventrolateral Medulla in brainstem
SDS       Sodium dodecyl sulfate gel electrophoresis



Reis et al. [Wang et al., Mol. Pharm., 42: 792-801 (1992);
Wang et al., Mol. Pharm., 43: 509-515 (1993)] were the first to
characterize an imidazoline-selective binding protein and to
demonstrate it as having MW = 70 kDa. This was accomplished
using bovine cells (BAC), which lack an α2AR [Powis & Baker, Mol.
Pharm., 29: 134-141 (1986)]. The 70 kDa imidazoline-selective
protein in those studies had high affinities for both idazoxan 
and p-aminoclonidine affinity chromatography columns and was 
eluted by another imidazoline compound (phentolamine) . 
Unfortunately, those investigators failed to isolate sufficient 
70 kDa protein to determine its other biochemical properties. To 
date, no one has reported the complete purification of an 
imidazoline receptor protein. Likewise, no amino acid sequences 
have been reported for IR. 

Their 70 kDa protein was used by Reis and co-workers to
raise "I-site binding antiserum", designated herein as Reis
antiserum. The term "I-site" refers to the imidazoline binding
site, presumably defined within the imidazoline receptor protein.
Reis antiserum was prepared by injecting the purified protein
into rabbits [Wang et al., 1992]. The first immunization was done
subcutaneously with the protein antigen (10 μg) emulsified in an
equal volume of complete Freund's adjuvant, and the next three
booster shots were given at 15-day intervals with incomplete
Freund's adjuvant. The polyclonal antiserum has been mostly
characterized by immunoblotting, but radioimmunoassays (RIA)
and/or conjugated assay procedures, i.e., ELISA assays, are also
conceivable [see "Radioimmunoassay of Gut Regulatory Peptides:
Methods in Laboratory Medicine," Vol. 2, chapters 1 and 2,
Praeger Scientific Press, 1982].

The present inventors and others [Escriba et al., Neurosci.
Lett., 178: 81-84 (1994)] have characterized the Reis antiserum in
several respects. For instance, the present inventors have
discovered that human platelet immunoreactivity with Reis
antiserum is mainly confined to a single protein band of MW ≈ 33
kDa, although a trace band at ≈ 85 kDa was also observed. The ≈
33 and ≈ 85 kDa bands were enriched in plasma membrane fractions
as expected for an imidazoline receptor. Furthermore, the
intensity of the ≈ 33 kDa band was found to be positively
correlated with non-adrenergic 125PIC Bmax values at platelet IR1
sites in samples from the same subjects, with an almost
one-to-one slope factor. In addition, the nonadrenergic 125PIC
binding sites on platelets were discovered by the present
inventors to have the same rank order of affinities as IR1
binding sites in brainstem [Piletz and Sletten, J. Pharm. & Exper.
Therap., 267: 1493-1502 (1993)]. The platelet ≈ 33 kDa band may
also be a product of a larger protein, since in human
megakaryoblastoma cells, which are capable of forming platelets
in tissue cultures, an ≈ 85 kDa immunoreactive band was found to
predominate.

Immunoreactivity with Reis antiserum does not appear to be
directed against human α2AR and/or MAO A/B. This is a
significant point because α2AR and MAO A/B have previously been
cloned and also bind to imidazolines. The present inventors have
obtained selective antibodies and recombinant preparations for
α2AR and MAO A/B, and these proteins do not correspond to the ≈
33, 70, or 85 kDa putative IR1 bands. Thus, there is substantial
evidence that, at least in human platelets, the Reis antiserum is
IR1 selective.

Another antiserum was raised by Drs. Dontenwill and Bousquet
in France [Greney et al., Europ. J. Pharmacol., 265: R1-R2
(1994); Greney et al., Neurochem. Int., 25: 183-191 (1994);
Bennai et al., Annals NY Acad. Sci., 763: 140-148 (1995)] against
polyclonal antibodies for idazoxan (designated Dontenwill
antiserum). This anti-idiotypic antiserum inhibits 3H-clonidine
but not 3H-rauwolscine (α2-selective) binding sites in the
brainstem, suggesting it also interacts with IR1 [Bennai et al.,
1995]. As shown in Fig. 1, human RVLM (same as NRL) membrane
fractions displayed bands of ≈ 41 and 44 kDa, as detected by the
present inventors using this anti-idiotypic antiserum.

The present inventors have found that the bands of MW ≈ 41
and 44 kDa detected by Dontenwill antiserum may be derived from
an ≈ 85 kDa precursor protein, similar to that occurring in
platelet precursor cells. An 85 kDa immunoreactive protein is
obtained in fresh rat brain membranes only when a cocktail of 11
protease inhibitors is used. Also, as shown in Fig. 1, it is
found that Reis antiserum detects the ≈ 41 and 44 kDa bands in
human brain when fewer protease inhibitors are used.
Additionally, the Dontenwill antiserum weakly detects a platelet
≈ 33 kDa band. Thus, the present inventors have hypothesized
that the ≈ 41 and 44 kDa immunoreactive proteins may be
alternative breakdown products of an ≈ 85 kDa protein, as opposed
to the platelet ≈ 33 kDa breakdown product.

In summary, the main conclusion from the above results is
that, despite vastly different origins, the Reis and Dontenwill
antisera both detect identical bands in human platelets, RVLM,
and hippocampus.

Using yet another technique, a photoaffinity imidazoline
ligand, 125AZIPI, has also been developed to preferentially label
I2-imidazoline binding sites [Lanier et al., J. Biol. Chem., 268:
16047-16051 (1993)]. The 125AZIPI photoaffinity ligand was used
to visualize ≈ 55 kDa and ≈ 61 kDa binding proteins from rat
liver and brain. It is believed that the ≈ 61 kDa protein is
probably MAO, in agreement with other findings [Tesson et al.,
J. Biol. Chem., 270: 9856-9861 (1995)] showing that MAO proteins
bind certain imidazoline compounds. The difference in molecular
weights between these bands and those detected immunologically by
the present inventors is one of many pieces of evidence that
distinguish IR1 from I2 sites.

To the inventors' knowledge and as described herein, we are
the first to clone the gene, cDNAs and fragments thereof encoding a
protein with the immunological and ligand binding properties
expected of an IR. On this basis, we are the first to identify the
nucleotide sequences of DNA molecules encoding an imidazoline
receptor and active fragments thereof, and the first to determine
the amino acid sequence of an imidazoline receptor and active
fragments thereof. The polypeptides described herein are clearly
distinct from α2AR or MAO A/B proteins.

SUMMARY OF THE INVENTION

The present invention involves various cDNA clones (i.e.,
5A-1 and EST04033) and a genomic clone (JEP-1A) which are directed
to an isolated polypeptide(s) that is receptive to (bind to)
imidazoline compound(s), and can be used to identify other
compounds of interest. Currently available imidazoline compounds
in this context are p-iodoclonidine and moxonidine. Initially,
the inventors detected a polypeptide expressed by their cDNA
clone (5A-1 isolated from a human hippocampus cDNA library) that
immunoreacted with Reis antiserum and/or Dontenwill antiserum.
The DNA sequence of the 5A-1 clone is encapsulated within a
portion of the other clones (EST04033 and JEP-1A genomic clone).

In one aspect of the invention, a polypeptide includes a 
651 amino acid sequence as shown in SEQ ID No. 5. This 
polypeptide is predicted from non-plasmid cDNA in EST04033; a 
clone which the inventors showed possesses sequences inclusive of 
5A-1. Furthermore, transfection of EST04033 into COS cells 
yielded imidazoline receptivity by radioligand binding assays 
(described in detail later). Other imidazoline receptive
proteins homologous to this polypeptide are also contemplated.
Such polypeptide(s) generally have a molecular weight of about 50
to 80 kDa. More particularly, one can have a molecular weight of
about 70 kDa.

In another aspect of this invention, a polypeptide includes 
a 390 amino acid sequence as shown in SEQ ID No. 6. This 
represents the polypeptide predicted from the non-plasmid DNA of 
the original 5A-1 clone. Such a polypeptide generally has a
molecular weight of about 35 to 50 kDa. More particularly, it 
can have a molecular weight of about 43 kDa. 
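
As a rough cross-check of the molecular weights quoted above for the 651 and 390 amino acid polypeptides, a mass can be approximated from the residue count alone. The following is a minimal sketch only: it assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in this application, and the function name is purely illustrative.

```python
# Rough molecular-weight estimate from polypeptide length.
# ASSUMPTION: about 110 Da average mass per amino acid residue
# (a rule of thumb, not a value taken from this application).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate molecular mass in kilodaltons."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Residue counts as stated for SEQ ID No. 5 and SEQ ID No. 6.
    for label, length in (("SEQ ID No. 5", 651), ("SEQ ID No. 6", 390)):
        mass = estimate_mass_kda(length)
        print(f"{label}: {length} residues, about {mass:.1f} kDa")
```

Under that assumption the estimates fall close to the approximately 70 kDa and 43 kDa figures given above; actual masses depend on amino acid composition and any post-translational modification.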

DNA molecules encoding the aforementioned imidazoline-receptive
polypeptide(s) are also contemplated. Such a DNA molecule, e.g.,
a cDNA derived from mRNA, can contain a nucleotide sequence
encoding the 651 amino acid sequence shown in SEQ ID No. 5.
Thus, a DNA molecule containing the 1954 base pairs (b.p.) (1954
b.p. encodes 651 amino acids) nucleotide sequence shown in SEQ ID
No. 2 is contemplated. This represents the coding sequence for
the polypeptide predicted by EST04033 transfections. In another
embodiment, a DNA molecule includes the longer nucleotide
sequence shown in SEQ ID No. 3. This represents the cDNA of
EST04033, comprising both the portion predicted to have been
translated and the portion not predicted to have been translated
in the transfection experiments.

In another embodiment of the invention, a DNA molecule 
contains a nucleic acid sequence encoding the amino acid sequence 
shown in SEQ ID No. 6. In another aspect, it can include the 
1171 b.p. nucleic acid sequence shown in SEQ ID No. 4. The 1171 
b.p. nucleic acid sequence shown in SEQ ID No. 4 is the 5A-1 non- 
plasmid DNA. 

The nucleic acid sequence of the genomic clone encoding the 
imidazoline receptor is further shown in SEQ ID No. 21. The 
nucleic acid and amino acid sequence of the predicted transcript 
(i.e., cDNA) can be predicted from the description hereinbelow.
The polypeptide encoded by the genomic DNA is shown in SEQ ID No.
22.

Sequence similarity with the sequences indicated in SEQ ID 
protocols of the attached Sequence Listing is defined in 
connection with the present invention as a very close structural 
relationship of the relevant sequences with the sequences 
indicated in the respective SEQ ID protocols. To determine the 
sequence similarity, in each case the structurally mutually 
corresponding sections of the sequence of the SEQ ID protocol and 
of the sequence to be compared therewith are superimposed in such 
a way that the structural correspondence between the sequences is 
a maximum, account being taken of differences caused by deletion 
or insertion of individual sequence members (DNA-codon or amino 
acid respectively), and being compensated by appropriate shifts
in sections of the sequences. The sequence similarity in %
results from the number of sequence members which now correspond
to one another in the sequences ("homologous positions") relative
to the total number of members contained in the sequences of the 
SEQ ID protocols. Differences in the sequences may be caused by 
variation, insertion or deletion of sequence members. 
Additionally in DNA sequences, different DNA-codons encoding for 
the same amino acid are considered identical in the context of 
the present invention. For amino acid sequences, conservative 
amino acid substitutions encoded by their corresponding DNA- 
codons, as well as naturally occurring homologs of the sequences, 
are considered within the context of sequence similarity. 
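
The percentage described above (homologous positions relative to the number of members in the SEQ ID sequence) can be sketched as follows. This is an illustrative simplification under stated assumptions: the two sequences are supplied already aligned, with "-" marking deletions or insertions; only exact matches are counted, so conservative substitutions and synonymous codons are not credited here; and the function name and the short test strings are hypothetical, not taken from the Sequence Listing.

```python
def percent_similarity(reference_aligned: str, candidate_aligned: str,
                       gap: str = "-") -> float:
    """Percent of reference members matched by the aligned candidate.

    ASSUMPTIONS (illustrative only): both inputs are pre-aligned strings of
    equal length, `gap` marks an inserted or deleted member, and only exact
    matches count as homologous positions.
    """
    if len(reference_aligned) != len(candidate_aligned):
        raise ValueError("aligned sequences must have equal length")

    # Total number of members contained in the reference (SEQ ID) sequence.
    reference_length = sum(1 for r in reference_aligned if r != gap)
    if reference_length == 0:
        raise ValueError("reference sequence is empty")

    # Positions where a reference member corresponds to the same member
    # in the compared sequence.
    homologous = sum(
        1
        for r, c in zip(reference_aligned, candidate_aligned)
        if r != gap and r == c
    )
    return 100.0 * homologous / reference_length


if __name__ == "__main__":
    # Hypothetical 8-residue reference with one deletion in the candidate.
    print(percent_similarity("MKTAYIAK", "MKTA-IAK"))
```

A fuller treatment would score conservative substitutions with a substitution matrix and would compute the alignment itself rather than take it as input.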

DNA molecules of substantial homology (> 75%) are an
implicit aspect of this sort of invention. As will be discussed
later, the inventors have already identified two possible splice
variants in the amino acid coding sequence. In addition,
artificially mutated receptor cDNA molecules can be routinely
constructed by methods such as site-directed polymerase chain
reaction-mediated mutagenesis [Nelson and Long, Anal. Biochem.,
180: 147-151 (1989)]. It is commonly appreciated that highly
homologous mutants frequently mimic their natural receptor. A
study by Kjelsberg et al. [J. Biol. Chem., 267: 1430-1433 (1992)]
showed that all 20 amino acid substitutions at a single site in
the α1B-adrenergic receptor produce an active receptor. RNA
molecules of > 75% complementarity to an instant DNA molecule,
e.g., an mRNA molecule (sense) or a complementary cRNA molecule
(antisense), are a further aspect of the invention.

A further aspect of the invention is for a recombinant 
vector, as well as a host cell transfected with the recombinant 
vector, wherein the recombinant vector contains at least one of 
the nucleotide sequences shown in SEQ ID Nos. 1-4, or sequences 
predicted by the genomic clone, or nucleotide sequences > 75 % 
homologous thereto. 

A method of producing an imidazoline receptor protein is 
another aspect of the invention. Such a method entails 
transfecting a host cell with an aforementioned vector, and 
culturing the transfected host cell in a culture medium to 
generate the imidazoline receptor. 

A method for producing homologous imidazoline receptor
proteins, and the proteins produced thereby, are also considered
an aspect of this invention.

A significant further aspect of the invention is a method of 
screening for a ligand that binds to an imidazoline receptor. 
Such a method can comprise culturing an above-mentioned 
transfected cell in a culture medium to express imidazoline 
receptor proteins, followed by contacting the proteins with a 
labelled ligand for the imidazoline receptor under conditions 
effective to bind the labelled ligand thereto. The imidazoline 
receptor proteins can then be contacted with a candidate ligand, 
and any displacement of the labelled ligand from the proteins can 
be detected. Displacement of labelled ligand signifies that the 
candidate ligand is a ligand for the imidazoline receptor. These 
steps could be performed on intact host cells, or on proteins 
isolated from the cell membranes of the host cells. 
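
The displacement readout described above can be sketched numerically. The following is a minimal illustration under stated assumptions: binding is measured as counts with and without the candidate ligand, a non-specific background is subtracted, and the 50% cut-off is a hypothetical screening threshold chosen for the example, not a value given in this application. All names are illustrative.

```python
def percent_displacement(bound_without_candidate: float,
                         bound_with_candidate: float,
                         nonspecific: float = 0.0) -> float:
    """Percent of specifically bound labelled ligand displaced by a candidate.

    ASSUMPTION: inputs are raw binding measurements (e.g., counts) in the same
    units; `nonspecific` is the background measured separately.
    """
    specific_control = bound_without_candidate - nonspecific
    if specific_control <= 0:
        raise ValueError("no specific binding in the control measurement")
    specific_with_candidate = bound_with_candidate - nonspecific
    return 100.0 * (specific_control - specific_with_candidate) / specific_control


def is_candidate_ligand(bound_without_candidate: float,
                        bound_with_candidate: float,
                        nonspecific: float = 0.0,
                        threshold_percent: float = 50.0) -> bool:
    """Flag a candidate whose displacement meets a HYPOTHETICAL threshold."""
    displacement = percent_displacement(
        bound_without_candidate, bound_with_candidate, nonspecific
    )
    return displacement >= threshold_percent


if __name__ == "__main__":
    # Hypothetical counts: 1000 total, 400 with candidate, 200 background.
    print(percent_displacement(1000.0, 400.0, 200.0))
    print(is_candidate_ligand(1000.0, 400.0, 200.0))
```

In practice a range of candidate concentrations would be tested so that a full competition curve, rather than a single point, supports the conclusion.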

The invention will now be described in more detail with 
reference to specific examples. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 depicts a comparison of Reis antiserum (lane 1, 
1:2000 dilution) and Dontenwill antiserum (lane 2, 1:5000 
dilution) immunoreactivities for human NRL (same as RVLM) and 
hippocampus, as discussed in Example 1.

Fig. 2 depicts a comparison of Reis antiserum (1:15,000
dilution) and Dontenwill antiserum (1:20,000 dilution)
immunoreactivities for plaques isolated from the human
hippocampal cDNA library used in cloning as discussed in Example
2. The plaques contain the initial clone, designated herein as
5A-1, in a third stage of purification.

Fig. 3 depicts the restriction map of the EST04033 cDNA
clone.

Fig. 4 depicts a competitive binding assay between
125I-labelled p-iodoclonidine (PIC) and various ligands for the
imidazoline receptor on membranes expressed in COS cells
transfected with the EST04033 cDNA clone, as discussed in Example
4.

Fig. 5 depicts the prediction of introns and exons of the
genomic clone (as analyzed by the GENESCAN program and verified
by the available cDNAs).

Fig. 6 depicts the distribution of mRNA homologous to our
cDNA in human adult tissues (bar graph) and the two species of
mRNA (6 and 9.5 kb).

DETAILED DESCRIPTION OF THE INVENTION 
The present invention is concerned with multiple aspects of 
an imidazoline receptor protein, and DNA molecules encoding the 
same, and fragments thereof, which have now been discovered. 

First, a polypeptide having imidazoline binding activity has
been identified, which contains the putative active site for
binding, as discussed hereinafter. Although the polypeptide(s)
described herein have a binding affinity for an imidazoline
compound, they may also have an enzymatic activity, as do
catalytic antibodies and ribozymes. In fact, one such domain
within our protein predicts a cytochrome P450 activity (described
later).

Exemplary "binding" polypeptides are those containing either 
of the amino acid sequences shown in SEQ ID Nos. 5 or 6 (with the 
amino acid sequence predicted by EST04033 given in SEQ ID No. 5).
Functionally equivalent polypeptides are also contemplated, such 
as those having a high degree of homology with such 
aforementioned polypeptides, particularly when they contain the 
Glu-Asp-rich region described hereinafter which we believe may 
define an active imidazoline binding site. 

A polypeptide of the invention can be formed by direct 
chemical synthesis on a solid support using the carbodiimide 
method [R. Merrifield, JACS, 85: 2143 (1963)]. Alternatively, 
and preferably, an instant polypeptide can be produced by a 
recombinant DNA technique as described herein and elsewhere 
[e.g., U.S. Patent No. 4,740,470 (issued to Cohen and Boyer), the
disclosure of which is incorporated herein by reference],
followed by culturing transformants in a nutrient broth.

Second, a DNA molecule of the present invention encodes
an aforementioned polypeptide. Thus, any of the degenerate set of
codons encoding an instant polypeptide is contemplated. A 
particularly preferred coding sequence is the 1954 b.p. sequence 
set forth in SEQ ID No. 2, which has now been discovered to be a 
nucleotide sequence that encodes a polypeptide capable of binding 
imidazoline compound(s). In another embodiment, a DNA molecule
includes the 3318 b.p. nucleotide sequence shown in SEQ ID No. 3.
This latter sequence is the entire EST04033 insert. It includes
the nucleotide sequence of SEQ ID No. 2 which was predicted to
have been translated into protein in the transfection
experiments.
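
The size of the degenerate set of codons referred to above can be illustrated with a minimal sketch, which counts how many distinct coding sequences encode a given peptide under the standard genetic code. The function name count_coding_sequences and the short example peptide are hypothetical and are not taken from the sequence listing.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not included).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Return the number of distinct DNA sequences that encode `peptide`."""
    total = 1
    for residue in peptide.upper():
        # Each residue independently multiplies the count by its codon number.
        total *= CODON_DEGENERACY[residue]
    return total


# Hypothetical four-residue peptide (Met-Glu-Asp-Glu), for illustration only.
example_count = count_coding_sequences("MEDE")
```

Because the count grows multiplicatively with peptide length, the degenerate set for a polypeptide of several hundred residues is far too large to enumerate, which is why it is described as a set rather than listed.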

In another embodiment of the invention, a DNA molecule 
contains a nucleic acid sequence encoding the amino acid sequence 
(390 residues) shown in SEQ ID No. 6. This amino acid sequence 
corresponds to that derived from direct sequencing of the 5A-1 
clone and represents a fragment of the native protein. The 5A-1 
DNA molecule is defined by the 1171 b.p. nucleic acid sequence 
shown in SEQ ID No. 4.

A DNA molecule of the present invention can be synthesized 
according to the phosphotriester method [Matteucci et al., JACS, 
103: 3185 (1988)]. This method is particularly suitable when it 
is desired to effect site-directed mutagenesis of an instant DNA 
sequence, whereby a desired nucleotide substitution can be 
readily made. Another method for making an instant DNA molecule 
is by simply growing cells transformed with plasmids containing 
the DNA sequence, lysing the cells, and isolating the plasmid DNA 
molecules. Preferably, an isolated DNA molecule of the invention 
is made by employing the polymerase chain reaction (PCR) [e.g.,
U.S. Patent No. 4,683,202 (issued to Mullis)] using synthetic
primers that anneal to the desired DNA sequence, whereby DNA 
molecules containing the desired nucleotide sequence are 
amplified. Also, a combination of the above methods can be
employed, such as one in which synthetic DNA is ligated to cDNA
to produce a quasi-synthetic gene [e.g., U.S. Patent No. 
4,601,980 (issued to Goeddel et al.)]. 

A further aspect of the invention is for a vector, e.g., a 
plasmid, that contains at least one of the nucleotide sequences 
shown in SEQ ID Nos. 1-4 or those predicted by the genomic clone 
in SEQ ID No. 21. Whenever the reading frame of the vector is 
appropriately selected, the vector encodes an IR polypeptide of 
the invention. Hence, as well as full-length protein, fragments 
of the native IR protein are contemplated; as well as fusion 
proteins that incorporate an amino acid sequence as described 
herein. Also, a vector containing a nucleotide sequence having a 
high degree of homology with any of SEQ ID Nos. 1-4 or 21 is 
contemplated within the invention, particularly when it encodes a
protein having imidazoline binding activity. 

A recombinant vector of the invention can be formed by 
ligating an afore-mentioned DNA molecule to a preselected 
expression plasmid, e.g., with T4 DNA ligase. Preferably, the 
plasmid and DNA molecule are provided with cohesive (overlapping) 
termini, with the plasmid and DNA molecule operatively linked
(i.e., in the correct reading frame). 

Another aspect of the invention is a host cell transfected 
with a vector of the invention. Relatedly, a protein expressed
by a host cell transfected with such a vector is contemplated, 
which protein may be bound to the cell membrane. Such a protein 
can be identical with an aforementioned polypeptide, or it can be
a fragment thereof, such as when the polypeptide has been
partially digested by a protease in the cell. Also, the 
expressed protein can differ from an aforementioned polypeptide, 
as whenever it has been subjected to one or more post- 
translational modifications. For the protein to be useful within 
the context of the present invention, it should exhibit 
imidazoline binding capacity. 

A method of producing an imidazoline receptor protein is 
another aspect of the invention, which entails transfecting a 
host cell with an aforementioned vector, and culturing the 
transfected host cell in a culture medium to generate the 
imidazoline receptor. The receptor molecule can undergo any 
post-translational modification(s), including proteolytic
decomposition, whereby its structure is altered from the basic
amino acid residue sequence encoded by the vector. Suitable
transfection methods include electroporation and the like.

With respect to transfecting a host cell with a vector of 
the invention, it is contemplated that a vector encoding an 
instant polypeptide can be transfected directly in animals. For 
instance, embryonic stem cells can be transfected, and the cells 
can be manipulated in embryos to produce transgenic animals. 
Methods for performing such an operation have been previously 
described [Bond et al., Nature, 374:272-276 (1995)]. These
methods for expressing an instant cDNA molecule in either tissue
culture cells or in animals can be especially useful for drug 
discovery.

Possibly the most significant aspect of the present
invention is in its potential for affording a method of screening 
for a ligand (drug) that binds to an imidazoline receptor. Such 
a method comprises culturing an above-mentioned host cell in a 
culture medium to express an instant imidazoline receptive 
polypeptide, then contacting the polypeptides with a labelled 
ligand, e.g., radiolabeled p-iodoclonidine, for the imidazoline 
receptor under conditions effective to bind the labelled ligand 
thereto. The polypeptides are further contacted with a candidate 
ligand, and any displacement of the labelled ligand from the 
polypeptides is detected. Displacement signifies that the 
candidate ligand actually binds to the imidazoline receptor.
These steps could be performed on intact host cells, or on
proteins isolated from the cell membranes of the host cells.

Typically, a suitable drug screening protocol involves
preparing cells (or possibly tissues from transgenic animals)
that express an instant imidazoline receptive polypeptide. In
this process, categories of chemical structure are systematically
screened for binding affinity or activation of the receptor
molecule encoded by the transfected cDNA. This process is
currently referred to as combinatorial chemistry. With respect
to the imidazoline receptor, a number of commercially available
radioligands, e.g., 125I-PIC, can be used for competitive drug
binding affinity screening.
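
By way of illustration only, the displacement measurement described above can be reduced to two small calculations. The sketch below assumes that total, nonspecific, and candidate-treated binding are available as counts, and it uses the Cheng-Prusoff relation, a common convention in competitive binding analysis that is not stated in this specification. The function names and the numerical values are hypothetical.

```python
def percent_displacement(total_bound: float,
                         nonspecific: float,
                         bound_with_candidate: float) -> float:
    """Percent of specifically bound labelled ligand displaced by a candidate."""
    specific = total_bound - nonspecific
    if specific <= 0:
        raise ValueError("specific binding must be positive")
    remaining = bound_with_candidate - nonspecific
    return 100.0 * (1.0 - remaining / specific)


def cheng_prusoff_ki(ic50: float,
                     radioligand_conc: float,
                     radioligand_kd: float) -> float:
    """Estimate Ki from IC50 (all three arguments in the same concentration units)."""
    if radioligand_kd <= 0:
        raise ValueError("Kd must be positive")
    return ic50 / (1.0 + radioligand_conc / radioligand_kd)


# Hypothetical counts: 1000 total, 200 nonspecific, 600 with candidate present.
displaced = percent_displacement(1000.0, 200.0, 600.0)

# Hypothetical IC50 of 100 nM with radioligand present at its Kd (0.5 nM).
ki_nm = cheng_prusoff_ki(100.0, 0.5, 0.5)
```

A candidate that displaces a substantial fraction of the labelled ligand at one concentration would ordinarily be retested across a range of concentrations before an IC50, and hence a Ki, is estimated.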

An alternative approach is to screen for drugs that elicit 
or block a second messenger effect known to be coupled to
activation of the imidazoline receptor, e.g., moxonidine-stimulated
arachidonic acid release. Even with a weak binding
affinity or activation by one category of chemicals, systematic 
variations of that chemical structure can be studied and a 
preferred compound (drug) can be deduced as being a good 
pharmaceutical candidate. Identification of this compound would 
lead to animal testing and upwards to human trials. However, the 
initial rationale for drug discovery becomes vastly improved with 
an instant cloned imidazoline receptor. 

Along these lines, a drug screening method is contemplated 
in which a host cell of the invention is cultured in a culture 
medium to express an instant imidazoline receptive polypeptide. 
Intact cells are then exposed to an identified agent (i.e.,
agonist, inverse agonist, or antagonist) under conditions 
effective to elicit a second messenger or other detectable 
responses upon interacting with the receptor molecule. The 
imidazoline receptive polypeptides are then contacted with one or 
more candidate chemical compounds (drugs), and any modification
in a second messenger response is detected. Compounds that mimic 
an identified agonist would be agonist candidates, and those 
producing the opposite response would be inverse agonist 
candidates. Those compounds that block the effects of a known 
agonist would be antagonist candidates for an in vivo imidazoline 
receptor. For meaningful results, the contacting step with a
candidate compound is preferably conducted at a plurality of
candidate compound concentrations.

A method of probing for another gene encoding an imidazoline
receptor or homologous protein is further contemplated. Such a 
method comprises providing a radiolabelled DNA molecule identical 
or complementary to one of the above-described cDNA molecules
(probe). The probe is then placed in contact with genetic
material suspected of containing a gene encoding an imidazoline 
receptor or encoding a homologous protein, under stringent 
hybridization conditions (e.g., a high stringency wash condition 
is 0.1 x SSC, 0.5% SDS at 65°C), and identifying any portion of 
the genetic material that hybridizes to the DNA molecule. 
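
The notion of a probe that is "identical or complementary" to a cDNA can be sketched in a few lines. The example below is a simplification that tests only for exact matches on either strand; it does not model hybridization stringency, salt, or temperature. The function names and the short sequences are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_matches(probe: str, target: str) -> bool:
    """True if `target` contains the probe or its reverse complement exactly."""
    probe = probe.upper()
    target = target.upper()
    return probe in target or reverse_complement(probe) in target


# Hypothetical sequences, for illustration only.
rc = reverse_complement("GAGGAT")
hit = probe_matches("ATCCTC", "TTGAGGATTT")
```

In this toy case the probe itself is absent from the target strand as written, but its reverse complement is present, so a complementary probe would still be expected to anneal.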

Still further, a method of selectively producing antibodies
(e.g., monoclonal antibodies) immunoreactive with an instant
imidazoline-receptive protein comprises injecting a mammal with
an aforementioned polypeptide, and isolating the antibodies 
produced by the mammal. This aspect is discussed in more detail 
in an example presented hereinafter. 

The present inventors began their search for a human 
imidazoline receptor cDNA by screening a λgt11 phage human
hippocampus cDNA expression library. Their research had
indicated that both of the known antisera (Reis and Dontenwill) 
that are directed against human imidazoline receptors were 
immunoreactive with identical bands on SDS gels of membranes 
prepared from the human hippocampus (and in other tissues). By
contrast, other brain regions either were commercially 
unavailable as cDNA expression libraries or yielded 
inconsistencies between the two antisera. Therefore, it was felt
that a human hippocampal cDNA library held the best opportunity
for obtaining a cDNA for an imidazoline receptor.
Immunoexpression screening was chosen over other cloning 
strategies because of its sensitivity when coupled with the ECL 
detection system used by the present inventors, as discussed 
hereinbelow. 

A number of unique discoveries led to identifying the first 
5A-1 clone as an imidazoline receptor cDNA. These included
discoveries that led to the choice of a hippocampal cDNA library
and adapting ECL to the antisera. Once the initial clone (5A-1)
was identified and sequenced, a more complete clone (EST04033)
was purchased without restriction from ATCC Inc. (Catalogue #
82815; American Type Culture Collection, Rockville, MD).
EST04033 was the only EST clone available at the time of the
discovery of 5A-1 that contained a segment of complete homology
(the origination of EST04033 is discussed later on). The binding
affinities of the expressed protein after transfection in COS 
cells were determined by radioligand binding procedures developed 
in the inventor's laboratory [Piletz and Sletten, 1993, ibid]. 

To identify an instant cDNA clone encoding an imidazoline
receptor it was preferable to employ both of the known antibodies 
to imidazoline receptors. These antibodies were obtained by 
contacting Dr. D. Reis (Cornell University Medical Center, New 
York City), and Drs. M. Dontenwill and P. Bousquet (Laboratoire 
de Pharmacologie Cardiovasculaire et Rénale, CNRS, Strasbourg,
France). These antisera were obtained free of charge and without
confidentiality or restrictions on their use. The former
antiserum (Reis antiserum) was derived from a published
imidazoline receptor protein [Wang et al., (1992, 1993), the
disclosures of which are incorporated herein by reference]. The
method for raising the latter antiserum (Dontenwill antiserum) 
has also been published [Bennai et al., (1995), the disclosure of 
which is also incorporated herein by reference]. The latter 
antiserum was developed using an anti-idiotypic approach that 
identified the pharmacologically correct (clonidine and idazoxan 
selective) binding site structure. 

Example 1. Selectivity of the Antisera

The obtained Reis antiserum had been prepared against a 
purified imidazoline binding protein isolated from BAC cells, 
which protein runs in denaturing-SDS gels at 70 kDa [Wang et al.,
1992, 1993]. The Dontenwill antiserum is anti-idiotypic, and 
thus is believed to detect the molecular configuration of an 
imidazoline binding site domain in any species. Prior to being 
used for screening plaques, both antisera were cleaned by
stripping out possible antibacterial antibodies. 

Both antisera have been tested to ensure that they are in 
fact selective for a human imidazoline receptor. In particular, 
we found that both of these antisera detected identical bands in 
human platelets and hippocampus; and in brainstem RVLM (NRL) by 
Western blotting (see Fig. 1). In these studies, in order to
increase sensitivity over previously published detection methods,
an ECL (Enhanced Chemiluminescence) system was employed. The
linearity of response of the ECL system was demonstrated with a 
standard curve. ECL detection was demonstrated to be very 
quantifiable and about ten times more sensitive than other 
screening methods previously used with these antisera. Western 
blots with antiserum dilutions of 1:3000 revealed 
immunoreactivity with as little as 1 ng of protein from a human 
hippocampal homogenate by dot blot analysis. 

For the studies depicted in Fig. 1, human hippocampal 
homogenate (30 µg) and NRL membrane proteins (10 µg) were
electrophoresed through a 12.5% SDS-polyacrylamide gel,
electrotransferred to nitrocellulose and sequentially incubated
with (1) the Reis antibody (1:2000 dilution) and (2) the 
Dontenwill antibody (1:5000 dilution). Immunoreactive bands were 
visualized with an Enhanced Chemiluminescence (ECL) detection kit 
(Amersham) using anti-rabbit Ig-HRP conjugated antibody at a 
dilution of 1:3000 and the ECL detection reagents. Following 
detection with the antibody, blots were stripped and reprocessed 
omitting the primary antibody to check for complete removal of 
this antibody. In panels A and B, lane 1 shows the 
immunoreactive bands observed with the Reis antibody and lane 2 
shows the bands detected with the Dontenwill antibody. Protein 
molecular weight standards are indicated to the left of each 
panel (in kDa).

Despite the diverse origins of the Reis and Dontenwill 
antisera, both of these antisera detected a similar 85 kDa
protein in human brain and other tissues. However, a 33 kDa band was
found in human platelets. Although the 33 kDa band is of smaller
size than that reported for other tissues [Wang et al., 1993;
Escriba et al., 1994; Greney et al., 1994], the fact that both
antisera detected it suggests that both the 85 kDa and 33 kDa
bands may be imidazoline binding polypeptides. The 85 and 33 kDa
bands were enriched in plasma membrane fractions, as is known to
be the case for IR1 binding, but not I2 binding [Piletz and
Sletten, 1993].

A significant positive correlation was established for the 
85 kDa band detected by the Dontenwill antiserum with IR1 Bmax
values across nine rat tissues (r² = 0.8736). A similar
positive correlation was established amongst platelet samples
from 15 healthy platelet donors between radioligand IR1 Bmax
values (but not I2 or α2AR Bmax values), and the 33 kDa band
(presumed I1 immunoreactivity) on Western blots. This
correlation exhibited a slope factor close to unity (results not
shown). These correlations strongly suggested that an IR1
binding protein might be revealed in an imidazoline receptor-
antibody Western blotting assay. Furthermore, the Reis antiserum
failed to detect authentic α2AR, MAO A, or MAO B bands on gels,
i.e., it was not immunoreactive with MAO at MW = 61 kDa, or α2AR
at MW = 64 kDa. Additionally, no immunoreactive bands were
observed using preimmune antiserum. Thus, after extensively
characterizing these antisera with human and rat materials, it
was concluded that these antisera are indeed selective for human
imidazoline receptor protein.

Example 2. Cloning of cDNA For An Imidazoline Receptor

A commercially available human hippocampal cDNA λgt11
expression library (Clontech Inc., Palo Alto, CA) was screened 
for immunoreactivity sequentially using both the anti-idiotypic 
Dontenwill antiserum and the Reis antiserum. Standard techniques 
to induce protein and transference to a nitrocellulose overlay 
were employed. [See, for instance, Sambrook et al., 1989, 
"Molecular Cloning: A Laboratory Manual," Cold Spring Harbor 
Laboratory Press]. After washing and blocking with 5% milk, the 
Dontenwill antiserum was added to the overlay at 1:20,000 
dilution in Tris-buffered saline, 0.05% Tween20, and 5% milk.
The Reis antiserum was employed similarly, but at 1:15,000
dilution. These high dilutions of primary antiserum were chosen
to avoid false positives. The secondary antibody was added, and 
positive plaques were identified using ECL. Representative 
results are shown in Fig. 2. 

Positive plaques were pulled and rescreened until tertiary 
screenings yielded only positive plaques. Four separate positive 
plaques were identified from more than 300,000 primary plaques in 
our library. Recombinant λgt11 DNA purified from each of the
four plaques was subsequently subcloned into E. coli pBluescript
vector (Stratagene, La Jolla, CA). Sequencing of these four cDNA
inserts in pBluescript demonstrated that they were identical,
suggesting that only one cDNA had actually been identified four
times. Thus, the screening had been verified as being highly
reproducible and the frequency of occurrence was as expected for
a single copy gene, i.e., one in 75,000 transcripts. As shown in 
Fig. 2, the protein produced by the first positive clone, 
designated 5A-1, tested positive with both the Reis antiserum and 
the Dontenwill antiserum. Clone 5A-1 has been deposited under the 
Budapest Treaty with the American Type Culture Collection (ATCC),
12301 Parklawn Drive, Rockville, MD, USA, 20852, on August 28, 
1997 and has been assigned deposit accession no. ATCC 209217. 
Tertiary-screened plaques of 5A-1 were all immuno-positive with 
either of the two known anti-imidazoline receptor antisera, but 
not with either preimmune antisera. These results suggested that 
clone 5A-1 encoded a fusion peptide similar to or identical with 
one of the predominant bands detected in human Western blots by 
both the Dontenwill and Reis antisera. 
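
The frequency stated above follows directly from the screening counts. A minimal sketch of the arithmetic is given below; it takes the "more than 300,000" primary plaques as exactly 300,000, which is an assumption made only so that the figure can be reproduced, and the function name is hypothetical.

```python
from fractions import Fraction


def positives_frequency(positive_plaques: int, plaques_screened: int) -> Fraction:
    """Fraction of screened plaques that scored positive."""
    if plaques_screened <= 0:
        raise ValueError("plaques_screened must be positive")
    return Fraction(positive_plaques, plaques_screened)


# Four positive plaques among (approximately) 300,000 primary plaques.
freq = positives_frequency(4, 300_000)
one_in = freq.denominator // freq.numerator
```

Because the number of plaques screened is given only as a lower bound, the resulting frequency should be read as approximate.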

Sequencing of the first four clones was performed by 
contracting with ACGT Company (Chicago, IL) after subcloning them 
into pBluescript vector SK (Stratagene). Both manual and
automatic sequencing strategies were employed which are outlined 
as follows: 
Manual Sequencing 

1. DNA sequencing was performed using T7 DNA polymerase and 
the dideoxy nucleotide termination reaction. 

2. The primer walking method [Sambrook et al., ibid.] was
used in both directions. 

3. (35S)dATP was used for labelling.

4. The reactions were analyzed on 6% polyacrylamide wedge or
non-wedge gels containing 8 M urea, with samples being loaded in 
the order of A C G T. 

5. DNA sequences were analyzed by MacVector Version 5.0
and by various Internet-available programs, e.g., the BLAST
program.

Automatic Sequencing 

1. DNA sequencing was performed by the fluorescent dye
terminator labelling method using AmpliTaq DNA polymerase
(Applied Biosystems Inc., Prism DNA Sequencing Kit, Perkin-Elmer
Corp., Foster City, CA).

2. The primer walking method was used. The primers actually 
used were a subset of those shown in SEQ ID Nos. 7-20.

3. Sequencing reactions were analyzed on an Applied 
Biosystems, Inc. (Foster City, CA) sequence analyzer. 

These results demonstrated that the initial clone (5A-1) 
contained a 1171 base pair insert (see SEQ ID No. 4). The entire
5A-1 cDNA was found to exist as an extended open reading frame for
translation into protein. Consequently, it was determined that 
the 5A-1 cDNA must be a fragment of a larger mRNA. 

cDNA Sequence Homologies 

Using programs and databases available on the Internet 
(retrieved from NCBI Blast E-mail Server address 
blast@ncbi.nlm.nih.gov), it was determined that the 5A-1 clone
encodes a previously undefined unique molecule. The BLASTP
program [1.4.8MP, 20-June-1995 (build 11/13/95)] was used to
compare all of the possible frames of amino acid sequences 
encoded by 5A-1 versus all known amino acid sequences available 
within multiple international databases [Altschul et al., J. Mol.
Biol., 215: 403-410 (1990)]. Only one protein, from Micrococcus
luteus, possessed a marginally significant homology (p = 0.04) (41%)
over a short stretch of 75 of the 390 amino acids encoded by
5A-1. Otherwise, there were not any amino acid homologies (i.e.,
with p < 0.05) for any known proteins. Therefore, the protein
encoded by 5A-1 is not significantly related to MAO A or B, α2AR,
or any other known eukaryotic protein in the literature.

In contrast to the amino acid search on BLASTP, two nearly 
homologous EST cDNA sequences of undefined nature covering 155 
and 250 b.p. of the 5A-1 clone were reported to exist using 
BLASTN (reached from the same Internet server on 11/13/95).
BLASTN is a program used to compare known DNA sequences from 
international databases, regardless of whether they encode a 
polypeptide. Neither of the two EST cDNA sequences having high
homology to 5A-1 has, to our knowledge, been reported anywhere
else except on the Internet. Both were derived as Expressed
Sequence Tags (ESTs) in random attempts to sequence the human
cDNA repertoire [as described in Adams et al., Science, 252:
1651-1656 (1991)]. As far as can be determined, the people who
generated these ESTs lack any knowledge of what protein(s) they
encode. One cDNA, designated HSA09H122, contained 250 b.p. with
7 unknown/incorrect base pairs (97% homology) versus 5A-1 over
the same region. HSA09H122 was generated in France (Genethon,
B.P. 60, 91002 Evry Cedex France) from a human lymphoblast cDNA
library. The other EST, designated EST04033, contained 155 b.p.
with 12 unknown/incorrect base pairs (92% homology) versus 5A-1
over the same region. EST04033 was generated at the Institute
for Genomic Research (Gaithersburg, MD) from a human fetal brain
cDNA clone (HFBDP28). Thus, both of these ESTs are short DNA
sequences and contain a number of errors (typical of
single-stranded sequencing procedures as used when randomly screening
ESTs).
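
The homology percentages quoted above are consistent with a simple count of matching positions over the aligned length. The sketch below reproduces that arithmetic under the assumption that every unknown or incorrect base pair is scored as a mismatch; the function name is hypothetical.

```python
def percent_identity(aligned_length: int, mismatches: int) -> float:
    """Percent of aligned positions that match."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= mismatches <= aligned_length:
        raise ValueError("mismatches must lie between 0 and aligned_length")
    return 100.0 * (aligned_length - mismatches) / aligned_length


# HSA09H122: 250 b.p. aligned, 7 unknown/incorrect base pairs.
hsa09h122_identity = percent_identity(250, 7)

# EST04033: 155 b.p. aligned, 12 unknown/incorrect base pairs.
est04033_identity = percent_identity(155, 12)
```

Rounded to the nearest whole percent, the two values agree with the 97% and 92% figures reported for the respective ESTs.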

Based on the BLASTN search, the owner of HSA09H122 was 
contacted in an effort to obtain that clone. The current owner 
of the clone appears to be Dr. Charles Auffret (Paul Brousse 
Hospital, Genetique, B.P. 8, 94801 Villejuif Cedex, France). Dr. 
Auffret indicated by telephone that his clone came from a lot of 
clones believed to be contaminated with yeast DNA, and he did not 
trust it for release. Contamination with yeast DNA of that clone 
was later confirmed to have been reported within an Internet 
database. Thus, HSA09H122 was not reliable. 

The other partial clone (EST04033) was purchased from 
American Type Culture Collection in Rockville, MD (ATCC Catalog 
no. 82815). A telephone call to the Institute for Genomic
Research revealed that it had been deposited at ATCC under 
[insert terms]. As far as can be determined, the present 
inventors were the first to completely sequence EST04033. The 
complete size of EST04033 was 3389 b.p. (SEQ ID No. 1), with a
3,318 b.p. nonplasmid insert (see SEQ ID No. 3). Within this
sequence of EST04033 the remaining 783 base pairs of the coding 
sequence presumed for a 70 kDa imidazoline receptor were 
predicted at the 5' side of 5A-1 (i.e., 783 coding nucleotides 
unique to EST04033 + 1171 coding nucleotides of 5A-1 = 1954 
predicted total coding nucleotides; assuming b.p.# 1397-1400 in
SEQ ID No. 1 encodes the initiating methionine). The entire 1954
b.p. coding region for an ~70 kDa protein is shown in SEQ ID No.
2. The nucleotide sequence of EST04033 was determined in the 
same manner as described previously for the 5A-1 clone. The 
nucleotide sequence of the entire clone is shown in SEQ ID No. 1. 
In this sequence, an identical overlap was observed for the 
sequence obtained previously for the 5A-1 clone and the sequence 
obtained for EST04033. The 5A-1 overlap began at EST04033 b.p. 
2,181 (SEQ ID No. 1) and continued to the end of the molecule (b.p.
3,351).
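
The base-pair bookkeeping in the preceding paragraph can be checked with a short sketch. It treats the stated coordinates as 1-based and inclusive, which is an assumption, and uses only figures given above; the constant and function names are hypothetical.

```python
def span_length(first_bp: int, last_bp: int) -> int:
    """Length of a 1-based, inclusive base-pair interval."""
    if last_bp < first_bp:
        raise ValueError("last_bp must not precede first_bp")
    return last_bp - first_bp + 1


OVERLAP_START = 2181     # first b.p. of the 5A-1 overlap in SEQ ID No. 1
OVERLAP_END = 3351       # last b.p. of the 5A-1 overlap in SEQ ID No. 1
UNIQUE_CODING_BP = 783   # coding nucleotides unique to EST04033

overlap_bp = span_length(OVERLAP_START, OVERLAP_END)
total_coding_bp = UNIQUE_CODING_BP + overlap_bp
print(overlap_bp, total_coding_bp)
```

Under that assumption the overlap spans 1171 b.p., matching the size of the 5A-1 insert, and the sum gives the 1954 b.p. predicted coding region.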

Conclusions About Our cDNA Clones 

cDNAs of the present invention encode a protein that is
immunoreactive with both of the known selective antisera for an 
imidazoline receptor, i.e., Reis antiserum and Dontenwill 
antiserum. Thus, an instant cDNA molecule produces a protein 
immunologically related to a purified imidazoline receptor and 
has the antigenic specificity expected for an imidazoline binding 
site. These antisera have been documented in the scientific 
literature as being selective for an "imidazoline receptor",
which provides strong evidence that such an imidazoline receptor
has indeed been cloned. 

As mentioned, our instant cDNA sequence contains an open
reading frame distinct from any previously described proteins.
Therefore, the encoded protein is novel, and it is unrelated to
α2-adrenoceptors or monoamine oxidases. Small hydrophobic
domains in the predicted amino acid sequence suggest that the
protein is probably membrane bound, as expected for an
imidazoline receptor. 

Example 3. Cloning of a Human Gene 

A pre-made genomic library of human placental DNA was 
purchased from Stratagene (La Jolla, CA) to screen for an IR gene 
by hybridization. The genomic library was constructed in 
Stratagene's vector λ FIX® II (catalog # 946206), and it was
grown in XL1-Blue MRA (P2) host bacteria. It was titered to
yield approximately 50,000 plaques per 137 mm plate. Lifts from
six such plates were screened in duplicate by hybridization.
The DNA probe used for screening was a 1.85 kb EcoRI fragment
from EST04033 cDNA (uniquely related to our sequences based on
the BLASTN). After the restriction digestion of EST04033 DNA,
the 1.85 kb fragment was extracted from an agarose
electrophoresis gel, cleaned according to the GENECLEAN® III kit
manual (BIO 101, Inc., P.O. Box 2284, La Jolla, CA), and
radiolabeled with [α-32P]dCTP according to Stratagene's Prime-It®
II Random Primer Labeling Kit manual. Plaques were lifted onto
137 mm Duralon-UV™ membranes (Stratagene's catalog #420102),
denatured, and cross-linked with Stratagene's UV-Stratalinker™
1800. Hybridization was conducted under high stringency
conditions: prehybridization = 6 X SSC, 1% SDS, 50% formamide,
and 100 µg/ml of sheared, denatured salmon sperm DNA at 42°C for
2 hrs; hybridization = 6 X SSC, 1% SDS, 50% formamide, and 100
µg/ml of sheared, denatured salmon sperm DNA at 45°C overnight;
wash = 2 washes of 1 X SSC, 0.1% SDS at 65°C and 3 washes of 0.2
X SSC, 0.2% SDS at 65°C. This hybridization procedure is
essentially described in Stratagene's vector λ FIX® II
instruction manual. Positive plaques were localized by 
developing Kodak BioMax films. Two positive genomic clones of 
identical size were retained through three rounds of screening. 

One of the positive genomic clones (designated JEP 1-A) was 
selected for complete characterization. It was found to contain 
an ≈17 kb insert. Large-scale preparations of this genomic
clone DNA were performed using the λ QUICK SPIN kit (BIO101, La
Jolla, CA). To verify that we had cloned a gene corresponding to
5A-1 and EST04033 cDNA, some restriction site positions in the 
genomic clone were determined using the FLASH Nonradioactive Gene 
Mapping Kit (Stratagene) and compared to Southern blots of human 
DNA. The location of genomic sequences highly related to (or 
identical to) those of our cDNA clones was determined by high 
stringency hybridization (as above) with the following 32P-labeled
probe: a 1110 bp ApaI-EcoRI fragment from the cDNA clone 5A-1.
This fragment was chosen as the probe because it lacks the GAG
repeat (encoding glutamic acids), which might have complicated
matters if it were found to be repeated elsewhere in the genome.
With genomic clone JEP1-A, we detected a 14.1 kb EcoRI fragment
and a 7.7 kb SacI fragment that hybridized with this probe.
Southern blots containing EcoRI- or SacI-digested human genomic
DNA (from human blood) with the 1110 bp ApaI-EcoRI cDNA probe
also resulted in the detection of a 14.1 kb EcoRI fragment and a
7.7 kb SacI fragment. No additional restriction fragments of
human genomic DNA appeared to hybridize with this probe under
lower stringency conditions. These results strongly suggested
that this gene (JEP-1A) encodes transcript(s) giving rise to the
5A-1 and EST04033 cDNA clones. Clone JEP-1A has been deposited
under the Budapest Treaty with the American Type Culture
Collection (ATCC), 12301 Parklawn Drive, Rockville, MD, USA,
20852, on August 28, 1997 and has been assigned deposit accession 
no. ATCC 209216. 

Genomic DNA sequencing was done by contract with Cadus 
Pharmaceutical Corporation (Tarrytown, NY). The original lambda
JEP1-A clone was subcloned into pZero (Invitrogen) as a 
convenient vector. The initial fragments for sequencing were 
derived from Sac I and Xba I restriction enzymes. The short Sac 
I fragments of 1.5, 3.0 and 3.5 kb were further digested with 
Hind III, Pst I, and Kpn I yielding 15 subclones of varying 
length. The procedure consisted of sequencing all these 
subclones and parent clones with vector forward and reverse 
primers. Subsequently, this initial round of sequencing was
supplemented with primer walking using custom oligonucleotides.
The Sac I fragments were joined together by primer walking using
the 2 Xba I fragments of 3 and 10 kb. Then, the largest Sac I
fragment (8 kb) and the 10 kb Xba I fragment were used as
templates for a transposon sequencing method. The method used was
the Primer Island Transposition Kit (Perkin-Elmer Corp., Norwalk,
CT; Applied Biosystems) (ABI). The kit consists of a synthetic
transposon (Ty1) containing forward and reverse primers and the
integrase enzyme which inserts the transposon randomly into the
target plasmid DNA. Transposon insertion is an alternative to
subcloning or primer walking when sequencing a large region of
DNA (Devine and Boeke, Nucleic Acids Res. 22: 3765-3772 (1994);
Devine et al., Genome Res., in press, (1997); Kimmel et al., In
Genome Analysis, a Laboratory Manual, Cold Spring Harbor Press,
NY, NY, in press (1997)). A total of over 250 individual
sequencing reactions were performed. Sequencing was done on ABI
model 373 and 377 automated sequencers using ABI dye-terminator
sequencing kits. Primers were designed using Gene Runner
software (Hastings Software, Hastings On Hudson, NY).
Oligonucleotides were purchased from Gibco-BRL (Gaithersburg,
MD). Sequence assembly was performed using Sequencher Software
(Gene Codes Corp., Ann Arbor, MI) from 4-fold redundancy of
sequences.

The entire sequence of our JEP-1A genomic clone is shown in 
SEQ. 21. The computer program, GENSCAN 1.0, was able to identify
splice sites of known topology. As expected, this gene contained
a number of introns. See Table 1 hereinbelow. Only one
continuous open reading frame was identified within our genomic 
clone. This open reading frame was interrupted by a number of 
introns (which is typical of eukaryotic transcripts) as shown in 
Fig. 5. The predicted polypeptide is encoded by the genomic DNA 
beginning at b.p. # 971 of SEQ ID No. 21. The predicted amino 
acid sequence of the polypeptide encoded thereby is shown in SEQ 
ID No. 22. As can be seen, the entire 5A-1 DNA and polypeptide 
sequence was encapsulated within this predicted genomic 
transcript. Therefore, there is no question that this is the 
gene encoding 5A-1 and EST04033 cDNA. In addition, JEP-1A has 
more nearly defined the full-length transcript (by at least 102 
more coding nucleotides than the cDNAs alone) . 

TABLE 1 

Position of Predicted Introns and Exons
GENSCAN 1.0    Date run: 26-Aug-97    Time: 12:35:39

Sequence gs_seqfile : 15202 bp : 58.36% C+G : Isochore 4 (57.00 -
100.00 C+G%)

Parameter matrix: HumanIso.smat
Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..

 1.01 Intr +    971   1084  114  1  0   69   98   200 0.836  20.91
 1.02 Intr +   4096   4177   82  0  1   37   53    81 0.358  -0.13
 1.03 Intr +   5732   5856  125  0  2  117   95    84 0.953  13.48
 1.04 Intr +   6997   7046   50  0  2   95  116    44 0.998   6.52
 1.05 Intr +   8416   9825 1410  1  0   96   94  2914 0.970 283.09
 1.06 Intr +  10489  10897  409  1  1   15   59   318 0.517  17.19
 1.07 Intr +  11293  11449  157  0  1   57   61   236 0.998  18.57
 1.08 Intr +  11923  12051  129  2  0   84   63   224 0.997  21.34
 1.09 Intr +  12570  12731  162  1  0   95   80   229 0.996  23.94
 1.10 Term +  13090  13700  611  2  2   59   41  1012 0.942  89.44
 1.11 PlyA +  14257  14262    6                               1.05
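
By way of illustration only, the Len column of Table 1 can be checked against the Begin and End coordinates with a few lines of arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with the lengths listed in the table; the names EXONS, exon_lengths and intron_lengths are hypothetical and are not part of the GENSCAN program.

```python
# Begin/End coordinates of the ten predicted exons, copied from Table 1.
EXONS = [
    (971, 1084), (4096, 4177), (5732, 5856), (6997, 7046), (8416, 9825),
    (10489, 10897), (11293, 11449), (11923, 12051), (12570, 12731),
    (13090, 13700),
]


def exon_lengths(exons):
    """Length of each exon in bp, assuming 1-based inclusive coordinates."""
    return [end - begin + 1 for begin, end in exons]


def intron_lengths(exons):
    """Length in bp of each gap between consecutive predicted exons."""
    return [
        next_begin - prev_end - 1
        for (_, prev_end), (next_begin, _) in zip(exons, exons[1:])
    ]


if __name__ == "__main__":
    print("exon lengths (bp):", exon_lengths(EXONS))
    print("intron lengths (bp):", intron_lengths(EXONS))
    print("total exon bp:", sum(exon_lengths(EXONS)))
```

The computed exon lengths can be compared row by row with the Len column; the intron lengths are derived values that the table itself does not list.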

A BLASTN analysis of the entire genomic sequence (on
08/26/97) demonstrated again that this gene has not been 
previously defined in the literature. 

As with the cDNA clones, some EST sequences of identity were
found (listed below and later). Of particular interest was a
variance in the first intron splice site predicted by the
computer. Upstream of that site (i.e., upstream of amino acids
PEKKGGE = amino acids predicted after first splice site) we have
identified two types of transcripts. Genomic clone JEP-1A
predicted 34 amino acids upstream of that sequence before
entering another intron upstream. In an identical manner, three
ESTs (H61282, AA428790 and AA428250) overlapped that entire
region in our clones and they contained the identical nucleotides
for those 34 amino acids, plus an additional 22 amino acids
further upstream. By comparison, however, our EST04033 varied
from all of these ESTs upstream of that site. This means that the
first 1,532 nucleotides of EST04033 (thought to encode
translation of amino acids 1-56 of EST04033 beginning at b.p.
1,398 in SEQ. 1) are completely at variance with the other ESTs
down to that splice site, but from there on they are identical.
This provides strong evidence that this site can generate two
alternatively spliced transcripts which can produce at least one
functional protein (i.e., the transfections with EST04033). For
the reader's information, this splice site is upstream of b.p. #
1,565 in SEQ. 1, b.p. # 168 in SEQ. 2, b.p. # 1,532 in SEQ. 3, amino
acid # 57 in SEQ. 5, and b.p. # 971 in the genomic SEQ. 21.

Genomic Sequence Analysis

Of interest is a unique glutamic- and aspartic acid-rich
region within our predicted protein. This region of the IR
protein delineates a highly unique span of 59 amino acids, 36 of
which are Glu or Asp residues (61%). This region was largely
discovered within clone 5A-1 and it is present within all
discovered and predicted transcripts from the gene (EST04033
included). This sequence lies between two potential
transmembrane loops (hydrophobic domains). The identification
of this unique Glu/Asp-rich domain within our clones is
consistent with an expected negatively charged pocket capable of
binding clonidine and agmatine, both of which are highly
positively charged ligands. Also, since the Dontenwill antiserum
was specifically developed against an idazoxan/clonidine binding
site, and its immunoreactivity is directed against the clone
5A-1/λgt11 fusion protein, this suggests that clone 5A-1 might
encode an imidazoline binding site. Furthermore, this Glu/Asp-
rich sequence is located within the longest stretch of homology
that the clone has with any known protein, i.e., the ryanodine
receptor (as determined by BLASTN). Specifically, we have
discovered four regions of homology between the imidazoline
receptor and the ryanodine receptor, which are all Glu/Asp-rich.
The total nucleic acid homology is 67% with the ryanodine
receptor DNA over the stretches encompassing this region.
However, this is not sufficient to indicate that the imidazoline
receptor is a subtype of the ryanodine receptor, because this
homologous stretch is still a minor portion of the overall
transcript(s) identified in the gene. Instead, this significant
homology may reflect a commonality in function between this
region of the IR and the ryanodine receptor.
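
For illustration, the Glu/Asp content of a protein segment, such as the 36 acidic residues in a span of 59 described above, can be tallied with a short routine. The sketch below is not the analysis actually performed; the function names are hypothetical, and the toy sequence is an invented placeholder built only to have the same 36-of-59 composition, not the sequence of SEQ ID No. 22.

```python
def acidic_fraction(segment: str) -> float:
    """Fraction of residues (one-letter code) that are Glu (E) or Asp (D)."""
    segment = segment.upper()
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(1 for residue in segment if residue in "DE") / len(segment)


def richest_window(protein: str, width: int = 59):
    """Return (start_index, fraction) of the most Glu/Asp-rich window."""
    if len(protein) < width:
        raise ValueError("protein is shorter than the window width")
    starts = range(len(protein) - width + 1)
    best = max(starts, key=lambda i: acidic_fraction(protein[i:i + width]))
    return best, acidic_fraction(protein[best:best + width])


# Hypothetical placeholder: 36 acidic residues within a 59-residue span.
TOY_SPAN = "E" * 20 + "D" * 16 + "G" * 23
TOY_PROTEIN = "MTGQVGA" + TOY_SPAN + "LLSAAPC"

if __name__ == "__main__":
    print("acidic fraction of toy span: %.2f" % acidic_fraction(TOY_SPAN))
    print("richest 59-residue window:", richest_window(TOY_PROTEIN))
```

Applied to a real predicted polypeptide, the same sliding window would locate the most acidic 59-residue stretch without assuming its position in advance.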

The Glu/Asp-rich region within the ryanodine receptor has
also been reported to define a calcium and ruthenium red dye
binding domain that modulates the ryanodine receptor/Ca++ release
channel located within the sarcoplasmic reticulum. The only
other charged amino acids within the Glu/Asp-rich region of our
clones are two arginines (the ryanodine receptor has uncharged
amino acids at the corresponding positions).

Based on this identification of Arg residues within the 
Glu/Asp-rich region of the predicted imidazoline binding site, 
the assistance of Dr. Paul Ernsberger (Case Western Reserve 
University, Cleveland, Ohio) was enlisted. Dr. Ernsberger 
performed phenylglyoxal attack of arginine on native PC-12 
membranes. Dr. Ernsberger was able to demonstrate that this 
treatment completely eliminated imidazoline binding sites in 
these membranes. This provides some indirect evidence that the 
native imidazoline binding site also contains an Arg residue. On 
the other hand, attempts to chemically modify cysteine and 
tyrosine residues, which are not located near the Glu/Asp-rich
region, did not affect PC-12 membrane binding of imidazolines.

As a further test of the sequence, it was determined whether
native IR binding sites in PC-12 cells would be sensitive to
ruthenium red. From the structure of the cloned sequence, it was
reasoned that native IR should bind ruthenium red. Accordingly,
a competition of ruthenium red with [125I]PIC at native IR sites on
PC-12 membranes was studied. In these studies it was observed
that ruthenium red competed for [125I]PIC binding to the same extent
as did the potent imidazoline compound, moxonidine, i.e., 100%
competition. Furthermore, the IC50 for competition of ruthenium
red at IR was slightly more robust than reported for ruthenium
red on the activation of calcium-dependent cyclic nucleotide
phosphodiesterase - the previous most potent effect of ruthenium
red on any biological site - indicating possible pharmacological
importance. It is also noteworthy that calcium failed to compete
for [125I]PIC binding at PC-12 IR sites (as did a calcium substitute,
lanthanum). We and others have previously reported that a number
of other cations robustly interfere with IR binding [Ernsberger
et al., Annals NY Acad. Sci., 763: 22-42 (1995); Ernsberger et
al., Annals NY Acad. Sci., 763: 163-168 (1995)]. Attempts were
also made to directly stain the proteins in SDS gels with
ruthenium red [Chen and MacLennan, J. Biol. Chem., 269: 22698-
22704 (1994)]. It was found that ruthenium red stains the same
platelet (33 kDa) and brain (85 kDa) bands that Reis antiserum
detects. (Remember, the same 33 kDa band was verified to
directly correlate with [125I]PIC Bmax values for IR.) Thus, these
results linked the attributes predicted from the cloned sequence
to a native IR binding site.

Two other facets of the predicted polypeptide from JEP-1A 
suggest that we have identified most of the functional sequences. 
First, our predicted protein is comparable in regard to both the
order and size of three regions of importance to the function of
the interleukin-2Rβ receptor (IL-2Rβ). Specifically, IL-2Rβ
possesses the following regions over a span of 286 amino acids:
ser-rich region, followed by glu/asp-rich region, followed by
proline-rich region. Likewise, our predicted protein has the
same three regions, in the same order, over a span of about 625
amino acids. This suggests that our protein might function
similarly to cytokine receptors. Secondly, our predicted protein
possesses a cytochrome p450 heme-iron ligand signature sequence
[Nelson et al., Pharmacogenetics 6: 1-42 (1996)]. This suggests
that our protein might also function as do cytochrome p450s in
oxidative, peroxidative and reductive metabolism of endogenous
compounds.

Some additional findings about the amino acid sequence of 
our instant IR polypeptide are: (1) The glu/asp-rich region also 
bears similarity to an amino acid sequence within a GTPase 
activator protein. (2) There appear to be four small hydrophobic 
domains indicative of transmembrane domain receptors. (3) A 
number of potential protein kinase C (PKC) phosphorylation sites 
appear near to the carboxy side of the protein, and we have 
previously found that treatment of membranes with PKC leads to an 
enhancement of native IR binding. Thus, these observations are 
all consistent with other observations expected for native IR. 

RNA Studies 

Northern blotting has also been performed on polyA+ mRNA
from human tissues in order to ascertain the regional expression
of the mRNA corresponding to our cDNA. The same 1110 b.p. ApaI-
EcoRI fragment from cDNA clone 5A-1 used in Southern blots was
used for these studies. This probe region was not found within
any other known sequences on the BLASTN database. The results
revealed a 6 kb mRNA band, which predominated over a much fainter
9.5 kb mRNA in most regions (Fig. 6). Some exceptions to this
pattern were in lymph nodes and cerebellum (Fig. 6), where the
9.5 kb band was equally or more intense. Although the 6 kb band
is weakly detectable in some non-CNS tissues, it is enriched in
brain. An enrichment of the 6 kb mRNA is observed in brainstem,
although not exclusively. The regional distribution of the mRNA
is somewhat in keeping with the reported distribution of IR
binding sites, when extrapolated across species (Fig. 6). Thus,
the rank order of Bmax values for IR in rat brain has been
reported to be frontal cortex > hippocampus > medulla oblongata >
cerebellum [Kamisaki et al., Brain Res., 514: 15-21 (1990)].
Therefore, with the exception of human cerebellum, which showed
two mRNA bands, the distribution of the mRNA for the present
cloned cDNA is consistent with it belonging to IR.
[It should be noted that while IR binding sites are commonly
considered to be low in cerebral cortex compared to brainstem,
this is in fact a misinterpretation of the literature based only
on comparisons to the alpha-2 adrenoceptor's Bmax, rather than on
absolute values. Thus, IR Bmax values have actually been
reported to be slightly higher in the cortex than the brainstem,
but they only "appear" to be low in the cortex in comparison to
the abundance of alpha-2 binding sites in cortex. Therefore, the
distribution of the IR mRNA is reasonably in keeping with the
actual Bmax values for radioligand binding to the receptor
[Kamisaki et al., (1990)].]

A final point to emphasize about the Northern blots is that
they clearly demonstrate two high-stringency transcripts (Fig.
6). This is in keeping with the alternatively spliced EST cDNAs
mentioned earlier. Thus, we suggest this may be the basis for
both the 6 and 9.5 kb transcripts.

Summary of Genomic Sequence Results 

The JEP-1A clone clearly contains most of the gene. Within
it we have identified at least 3,776 nucleotides for
transcript(s) (encoding 1,065 amino acids plus 587 b.p. of
untranslated region down to the polyT+ tail). This has been
lengthened by at least 66 coding nucleotides upstream (~22 amino
acids) in comparison to overlapping ESTs. In addition to this,
we are quite confident of the splice site for the two observed
mRNA sizes. Most of the functional sequences are predicted to be
encoded within our genomic clone.

The evidence that a gene encoding an imidazoline receptor
protein has been cloned is summarized in Table 2 hereinbelow.

TABLE 2

Comparison of Protein Predicted From Our Clones with
Properties of Native IR1 and I2 Sites

Imidazoline Receptor-      Authentic IR1               Authentic I2
like Clone
-------------------------  --------------------------  ------------------------
Original λ phage fusion    Dontenwill-Ab activity      Dontenwill & Reis Abs
protein (from 5A-1) is     (a) inhibits RVLM IR1       both inhibit brain I2
immunoreactive with        binding (3H-clonidine),     sites (3H-IDX).
Dontenwill and Reis        & (b) correlates with
antibodies                 85 kDa Western band.
                           Reis-Ab activity
                           correlates w/ platelet
                           IR1 Bmax ([125I]PIC
                           binding)

Segment homologous to a    Weak to moderate            Not sensitive to GTP
GTPase-activator prot'n    sensitivity to GTP

Predicts >120,000 MW       85,000 MW                   59-61,000 MW
protein                    immunoreactivity            photoaffinity

Predicts 1-4               Enriched in plasma          Enriched in
hydrophobic domains        membranes                   mitochondria

Encodes Glu/Asp-rich       • Binds (+)-charged         • Binds (+)-charged
(negatively charged)         imidazolines                imidazolines
domain consistent with     • Sensitive to              • Not sensitive to
Ca++ and ruthenium red       divalent cations            divalent cations
binding                    • Sensitive to              • Unknown sensitivity
                             ruthenium red               for ruthenium red

Arginine is only           • Arg attack eliminates     Unknown
positively charged           binding
amino acid near Glu/Asp    • Cys & Tyr attack w/o
                             effect on binding

Encodes PKC sites          PKC treatment enhances      Unknown
                           binding

Human mRNA                 Rat IR1 Bmax ([125I]PIC):   Rat I2 Bmax (3H-IDX):
distribution: F. Cortex    F. Cortex > hippocampus     Medulla > F. Cortex
> hippocampus > medulla    > medulla

Transfected COS-7 cells    High affinity for           Low affinity for
expressed high affinity    moxonidine and PIC          moxonidine and PIC
for moxonidine &
p-iodoclonidine (PIC)

Example 4. Transient Transfection Studies

COS-7 cells were transfected with a vector containing
EST04033 cDNA, which was predicted based on sequence analysis to
contain the Glu/Asp-rich region thought to be important for
ligand binding to the imidazoline receptor protein. The EST04033
cDNA was subcloned into pSVK3 (Pharmacia LKB Biotechnology,
Piscataway, NJ) using standard techniques [Sambrook, supra], and
transfected via the DEAE-dextran technique as previously
described [Choudhary et al., Mol. Pharmacol., 42: 627-633 (1992);
Choudhary et al., Mol. Pharmacol., 43: 557-561 (1993); Kohen et
al., J. Neurochem., 66: 47-56 (1996)]. A restriction map of the
EST04033 cDNA is shown in Fig. 3. The restriction enzymes Sal I
and Xba I were used for subcloning into pSVK3.

Briefly stated, COS-7 cells were seeded at 3 x 10^6 cells/100
mm plate, grown overnight and exposed to 2 ml of DEAE-
dextran/plasmid mixture. After a 10-15 min. exposure, 20 ml of
complete medium (10% fetal calf serum; 5 nq/ml streptomycin; 100
units/ml penicillin, high glucose, Dulbecco's modified Eagle's
medium) containing 80 µM chloroquine was added and the incubation
continued for 2.5 hr. at 37 °C in a 5% CO2 incubator. The mixture
was then aspirated and 10 ml of complete medium containing 10%
dimethyl sulfoxide was added with shaking for 150 seconds.

Following aspiration, 15 ml of complete medium with dialyzed
serum was added and the incubation continued for an additional 65
hours. After this time period, the cells from 6 plates were
harvested and membranes were prepared as previously described
[Ernsberger et al., Annals NY Acad. Sci., 763: 22-42 (1995), the
disclosure of which is incorporated herein by reference].
Parent, untransfected COS-7 cells were prepared as a negative
control. Some membranes were treated with or without PKC for 2
hrs prior to analysis, since previous studies had indicated that
receptor phosphorylation could be beneficial to detect IR
binding.

Transfected samples were also analyzed by Western blots.
The protocol used for Western blot assay of transfected cells is
as follows. Cell membranes were prepared in a special cocktail
of protease inhibitors (1 mM EDTA, 0.1 nM EGTA, 1 mM
phenylmethylsulfonyl fluoride, 10 mM ε-aminocaproic acid, 0.1 mM
benzamide, 0.1 mM benzamide-HCl, 0.1 mM phenanthroline, 10 µg/ml
pepstatin A, 5 mM iodoacetamide, 10 µg/ml antipain, 10 µg/ml
trypsin-chymotrypsin inhibitor, 10 µg/ml leupeptin, and 1.67
µg/ml calpain inhibitor) in 0.25 M sucrose, 1 mM MgCl2, 5 mM
Tris, pH 7.4. Fifteen µg of total protein were denatured and
separated by SDS gel electrophoresis. Gels were equilibrated and
electrotransferred to nitrocellulose membranes. Blots were then
blocked with 10% milk in Tris-buffered saline with 0.1% Tween-20
(TBST) during 60 min. of gentle rocking. Afterwards, blots were
incubated in anti-imidazoline receptor antiserum (1:3000 dil.)
for 2 hours. Following the primary antibody, blots were washed
and incubated with horseradish peroxidase-conjugated anti-rabbit
goat IgG (1:3000 dil.) for 1 hr. Blots were extensively washed
and incubated for 1 min. in a 1:1 mix of Amersham ECL detection
solution. The blots were wrapped in cling-film (SARAN WRAP) and
exposed to Hyperfilm-ECL (Amersham) for 2 minutes. Quantitation
was based on densitometry using a standard curve of known amounts
of protein containing BAC membranes or platelet membranes run in
each gel.

One nM [125I]p-iodoclonidine was employed in the radioligand
binding competition assays, since at this low concentration this
radioligand is selective for the IR site much more than for I2
binding sites. The critical processes of membrane preparation of
tissue culture cells and the radioligand binding assays of IR and
I2 have been reviewed by Piletz and colleagues [Ernsberger et
al., Annals NY Acad. Sci., 763: 510-519 (1995)]. Total binding (n
= 12 per experiment) was determined in the absence of added
competitive ligands and nonspecific binding was determined in the
presence of 10^-4 M moxonidine (n = 6 per experiment). Log normal
competition curves were generated against unlabeled moxonidine,
p-iodoclonidine, and (-) epinephrine. Each concentration of the
competitors was determined in triplicate and the experiment was
repeated thrice.

The protocol to fully characterize radioligand binding in
the transfected cells entails the following. First, the presence
of IR and/or I2 binding sites is scanned over a range of protein
concentrations using a single concentration of [125I]-p-
iodoclonidine (1.0 nM) and 3H-idazoxan (8 nM), respectively. Then,
rate of association binding experiments (under a 10 µM mask of NE
to remove α2AR interference) are performed to determine if the
kinetic parameters are similar to those reported for native
imidazoline receptors [Ernsberger et al., Annals NY Acad. Sci.,
763: 163-168 (1995)]. Then, full Scatchard plots of [125I]-p-
iodoclonidine (2-20 nM if like IR) and 3H-idazoxan (5-60 nM if
like I2) binding are conducted under a 10 µM mask of NE. Total
NE (10 µM)-displaceable binding is ascertained as a control to
rule out α2-adrenergic binding. The Bmax and KD parameters for
the transfected cells are ascertained by computer modeling using
the LIGAND program [McPherson, G., J. Pharmacol. Meth., 14: 213-228
(1985)] using 20 µM moxonidine to define IR nonspecific binding,
or 20 µM cirazoline to define I2 nonspecific binding.
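
As an illustration of how Bmax and KD relate to a Scatchard plot, the sketch below fits a straight line to bound/free versus bound for a single class of sites. It is a simplified linear treatment and is not the algorithm of the LIGAND program, which performs nonlinear curve fitting; the function names and the Bmax and KD values are hypothetical, chosen only so that the free concentrations fall within the 2-20 nM range mentioned above.

```python
def specific_bound(free_nM, bmax, kd_nM):
    """One-site saturation binding: B = Bmax * F / (KD + F)."""
    return bmax * free_nM / (kd_nM + free_nM)


def scatchard_fit(free_nM, bound):
    """Least-squares line through (B, B/F); returns (Bmax, KD).

    For one class of sites, B/F = Bmax/KD - B/KD, so the slope is -1/KD
    and the intercept on the B/F axis is Bmax/KD.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free_nM)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    return intercept * kd, kd


# Hypothetical, noise-free example values (not data from this study).
FREE_NM = [2.0, 5.0, 10.0, 15.0, 20.0]
HYPOTHETICAL_BMAX = 100.0   # e.g. fmol/mg protein
HYPOTHETICAL_KD_NM = 5.0

if __name__ == "__main__":
    bound = [specific_bound(f, HYPOTHETICAL_BMAX, HYPOTHETICAL_KD_NM)
             for f in FREE_NM]
    print("fitted (Bmax, KD):", scatchard_fit(FREE_NM, bound))
```

With real data the points scatter about the line, and a curved Scatchard plot would suggest more than one class of sites, which is one reason nonlinear modeling is generally preferred.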

The results of the transient transfection experiments of the 
imidazoline receptor vector into COS-7 cells are shown in Fig. 4. 
Competition binding experiments were performed using membrane 
preparations from these cells and [125I]PIC was used to radiolabel IR
sites. A mask of 10 µM norepinephrine was used to rule out any
possible α2AR binding in each assay even though parent COS-7
cells lacked any α2AR sites. Moxonidine and p-iodoclonidine (PIC)
were the compounds tested for their affinity to the membranes of
transfected cells. As can be seen, the affinities of these
compounds in competition with [125I]PIC were well within the high
affinity (nM) range. 

The following IC50 values and Hill slopes were obtained in
this study: moxonidine, IC50 = 45.1 nM (Hill slope = 0.35 ±
0.04); p-iodoclonidine without PKC pretreatment of the membranes,
IC50 = 2.3 nM (Hill slope = 0.42 ± 0.06); p-iodoclonidine with PKC
pretreatment of the membranes, IC50 = 19.0 nM (Hill slope = 0.48 ±
0.08). Shallow Hill slopes for [125I]p-iodoclonidine have been
reported before in studies of the interaction of moxonidine and
p-iodoclonidine with the human platelet IR1 binding site [Piletz
and Sletten, (1993)]. Epinephrine failed to displace any of the
[125I]p-iodoclonidine binding in the transfected cells, as
expected since this is a nonadrenergic imidazoline receptor.
Furthermore, untransfected cells showed less than 5% of the
displaceable binding observed in the transfected cells -
and this "noise" in the parent cells all appeared to be low
affinity (data not shown). These results thus demonstrate the
high affinities of two imidazoline compounds, p-iodoclonidine and
moxonidine, for the portion of our cloned receptor encoded within
EST04033. PKC pretreatment of the membranes had no effect in the
transfected COS cells.
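
To illustrate what a shallow Hill slope implies, the sketch below uses the common logistic competition model, in which the fraction of specific radioligand binding remaining at competitor concentration C is 1 / (1 + (C / IC50)^nH). This model is an assumption made for illustration and is not stated in the text to be the one used for curve fitting; the function names are hypothetical, and the IC50 and Hill slope shown are the moxonidine values reported above.

```python
def fraction_bound(conc_nM, ic50_nM, hill):
    """Fraction of specific binding remaining under the logistic model."""
    return 1.0 / (1.0 + (conc_nM / ic50_nM) ** hill)


def fold_range_10_to_90(hill):
    """Concentration ratio needed to go from 10% to 90% displacement.

    Displacement is 10% when (C/IC50)^nH = 1/9 and 90% when it equals 9,
    so the ratio of the two concentrations is 81^(1/nH).
    """
    return 81.0 ** (1.0 / hill)


MOXONIDINE_IC50_NM = 45.1   # from the text
MOXONIDINE_HILL = 0.35      # from the text

if __name__ == "__main__":
    for conc in (0.1, 1.0, 45.1, 1000.0, 100000.0):
        remaining = fraction_bound(conc, MOXONIDINE_IC50_NM, MOXONIDINE_HILL)
        print("%10.1f nM -> %.3f of specific binding remaining"
              % (conc, remaining))
    print("10%%-90%% fold range at slope 1.0: %.0f" % fold_range_10_to_90(1.0))
    print("10%%-90%% fold range at slope 0.35: %.3g"
          % fold_range_10_to_90(MOXONIDINE_HILL))
```

Under this model a slope of 1.0 spans 10% to 90% displacement over an 81-fold concentration range, whereas a slope well below 1.0 stretches the same displacement over several orders of magnitude, which is why shallow curves are sampled over a wide concentration range.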

It was also observed that the level of the expressed 
protein, as measured by Western blotting of the transfected 
cells, was consistent with the level of IR binding that was 
detected. In other words, a protein band was uniquely detected 
in the transfected cells, and it was of a density consistent with 
the amount of radioligand binding. Hence, the present results 
are in keeping with those expected for an imidazoline receptor. 
In summary, these data provide direct evidence that the EST04033 
clone encodes an imidazoline binding site having high affinities 
for moxonidine and p-iodoclonidine, which is expected for an IR 
protein. 

Example 5. Stable Transfection Methods.

Stable transfections can be obtained by subcloning the
imidazoline receptor cDNA into a suitable expression vector,
e.g., pRc/CMV (Invitrogen, San Diego, CA), which can then be used
to transform host cells, e.g. CHO and HEK-293 cells, using the
Lipofectin reagent (Gibco/BRL, Gaithersburg, MD) according to the
manufacturer's instructions. These two host cell lines can be
used to increase the permanence of expression of an instant
clone. The inventors have previously ascertained that parent CHO
cells lack both alpha2-adrenoceptor and IR binding sites [Piletz
et al., J. Pharm. & Exper. Ther., 272: 581-587 (1995)], making
them useful for these studies. Twenty-four hours after
transfection, cells are split into culture dishes and grown in
the presence of 600 µg/ml G418-supplemented complete medium
(Gibco/BRL). The medium is changed every 3 days and clones
surviving in G418 are isolated and expanded for further
investigation.

Example 6. Direct Cloning of More Complete Gene and Other
Homologous Human IR. 

Direct probing of other human genomic and cDNA libraries can 
be performed by preparing labelled cDNA probes from different 
subcloned regions of our clone. Commercially available human DNA 
libraries can be used. Besides the cDNA and genomic libraries we 
have already screened, another genomic library is EMBL 
(Clontech), which integrates genomic fragments up to 22 kbp long.
It is reasonable to expect that introns may exist within other
human IR genes so that only by obtaining overlapping clones can 
the full-length genes be sequenced. A probe encompassing the 5' 
end of an instant cDNA is generally useful to obtain the gene 
promoter region. Clontech's Human PromoterFinder DNA Walking 
procedure provides a method for "walking" upstream or downstream 
from cloned sequences such as cDNAs into adjacent genomic DNA. 

Example 7. Methods for Preparing Antibodies to Imidazoline
Receptive Proteins.

An instant imidazoline receptive polypeptide can also be 
used to prepare antibodies immunoreactive therewith. Thus, 
synthetic peptides (based on deduced amino acid sequences from 
the DNA) can be generated and used as immunogens. Additionally, 
transfected cell lines or other manipulations of the DNA sequence 
of an instant imidazoline receptor can provide a source of 
purified imidazoline receptor peptides in sufficient quantities 
for immunization, which can lead to a source of selective 
antibodies having potential commercial value. 

In addition, various kits for assaying imidazoline receptors 
can be developed that include either such antibodies or the 
purified imidazoline receptor protein. A purification protocol
has already been published for the bovine imidazoline receptor in
BAC cells [Wang et al., 1992] and an immunization protocol has
also been published [Wang et al., 1993]. These same protocols
can be utilized with little if any modification to afford
purified human IR protein from transfected cells and to yield
selective antibodies thereto.

In order to obtain antibodies to a subject peptide, the 
peptide may be linked to a suitable soluble carrier to which 
antibodies are unlikely to be encountered in human serum. 
Illustrative carriers include bovine serum albumin, keyhole 
limpet hemocyanin, and the like. The conjugated peptide is 
injected into a mouse, or other suitable animal, where an immune 
response is elicited. Monoclonal antibodies can be obtained from 
hybridomas formed by fusing spleen cells harvested from the 
animal and myeloma cells [see, e.g., Kohler and Milstein, Nature,
256: 495-497 (1975)].

Once an antibody is prepared (either polyclonal or 
monoclonal) , procedures are well established in the literature, 
using other proteins, to develop either RIA or ELISA assays [see,
e.g., "Radioimmunoassay of Gut Regulatory Peptides; Methods in
Laboratory Medicine," Vol. 2, chapters 1 and 2, Praeger
Scientific Press, 1982]. In the case of RIA, the purified
protein can also be radiolabelled and used as a radioactive 
antigen tracer. 

Currently available methods to assay imidazoline receptors 
are unsuitable for routine clinical use, and therefore the 
development of an assay kit in this manner could have significant 
market appeal. Suitable assay techniques can employ polyclonal 
or monoclonal antibodies, as has been previously described [U.S. 
Patent No. 4,376,110 (issued to David et al.), the disclosure of
which is incorporated herein by reference].
Summary 

In summary, we have identified unique DNA sequences that have
properties expected of a gene and the cDNA transcript(s) of an
imidazoline receptor. Prior to our first cloning the cDNA, only
two sequences of EST cDNA were identified within public databases
having similar nature. But these were both partial and
imprecise sequences - not identified at all with respect to any
encoded protein. Indeed, one of them (HSA09H122) was reported to
be contaminated. In our hands, the other EST04033 clone was
correctly sequenced for the first time (in its entirety = 3318
bp). Prior to this, even the size of EST04033 was unknown. The
present inventors also demonstrated that an imidazoline receptive
site can be expressed in cells transfected with the EST04033
cDNA clone, and this site has the proper potencies of an IR. We
have deduced most of the complete cDNA encoding this protein.

The present invention has been described with reference to 
specific examples for purposes of clarity and explanation. 
Certain obvious modifications of the invention readily apparent 
to one skilled in the art can be practiced within the scope of 
the appended claims. 

(i) APPLICANT: John E. Piletz
Tina R. Ivanov 
(ii) TITLE OF INVENTION: DNA Sequence Encoding a Human 
Imidazoline Receptor 
(iii) NUMBER OF SEQUENCES: 22 
(iv) CORRESPONDENCE ADDRESS: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1



:tctagaac TAGTGGATCC CCCGGGCTGC AGGAATTCCA GTTTAATACT AACCCTAATG   60
5TGACTGCG GTTTACAAAG AGCTCTGTAT CACCTGGGAT AGCTTTCAGT AGCAATTCAC  120
\CAACTGGT CCTAAAAAAT AATAACAATA ATAATAATAA TTAGAGAATT AAAACCCAAC  180
3CATGTTGA ATGGTTAAAA TCACGTAAGA ACTGAAATTT GGGGTGGGGG TGTCCTCAAC  240
3CTGAGCTT GTCCTAGCAG TGAAAATGCT CGCCTCCAAG CAGGGCTCAG AAAGGTCTGG  300
5CCCTCCAG GCAGAGGGCT GAGCTCAGGG GGCTCTTGGA GGACACTCAC CCCATGGTCC  360
CGGGATGCT TCTGGCTTCC TTAAAAACAG TTGGGCATCC GCATTGTATA AGTAGGTGGA  420
^CCCTAGTG TGGTTCTTTT GAAGGATATG GGAAGGGAGG ATGACGAACT AGAGAAGTGG  480
IGGGGACCA AAATCACTGA GGTCCCAGAA TATCATAGAT TTGGGTATAG GATTGGGGTC  540
:taagaatt GAGCACCAGG AATTCCAGCT TCTTCCCATT AAAGAAACTG GGACTGGTTT  600
;CGTTGGAG GCCTATGTAG TGTTTTCTGC CCCTGTCCCA TACCAAGTCT CATTGATATT  660
:t(§:agaat ATCAGATGAA AATCTATTTC TAAAGACCAT TGGGAGAATG GGTGGTGGAG  720
IG<||GTTGG AGTGGGGTTG GGGGGCAGTT AAAAATGAAT AAAAATCTCT CAGCTACAGA  780
:C?%AACAT CACTTCCCTC CGCATTCACA GCATTTCCCA GCAGTCCCCA GATGGTTGTT  840
:CGTGGGGA CACAGCAGCT GCCTCATTTC CCTTCAGGCC CCATGGGCTG CTGGTCAACC  900
:a<Satcta CTAAAGATGA CGCAAATGCC GACTGAACAA TCTGAAACCC AAAGGACTCG  960
;ga§agaca TGTTCTGCTG AGGAGAGAAA GGTGAGCCAA GGGCAGGGCC CAGGTCCCCC 1020
jggBgcccc CGAGAGCCCG GACATGCACC TTCTGGATGT GTTTGTTCAA GTAGGACTTA 1080
lGCGGAAGA AGCTCCCACA TTCAGGGCAT GGGTACTTCT TCTCCCCATC AGACTCCATT 1140
?GTTTTTGG GGACTGCCAT GTCGCAGGAG AAAGAGCCAT TGGCACTCTG CTTCTCTGGC 1200
?CTTCAGGT CGCTGGCATC TGAGAGGTCA CCATAGGAGT CAGAGCTCTC AATCGGATCC 1260
JATGTGAGC ATTTCTGGCC TTCTCGGTTA CAGATACTGC AGAAGTTGCT GGGCCCCTCG 1320
'GTGCTTCT TCAGGTGGTC TGCCATGTAT GCTGCCCGCA AGTACTTCCC ACACACCTGG 1380
lGGGCACCT TGTCTTC ATG ACA GGC CAG GTG GGA GCG CAG ACG GTC TCG 1430
                  Met Thr Gly Gln Val Gly Ala Gln Thr Val Ser



GGT GGC AAA AGA AGC ATT GCA GGT CTG ACA CTT GTG AGG CCG CTC AGA 1478
Gly Gly Lys Arg Ser Ile Ala Gly Leu Thr Leu Val Arg Pro Leu Arg
            15                  20                  25

AGT GTG CAC CTG CTT GAT ATG TCC GTT CAA GTG ATC AGG CCT GGA GAA 1526
Ser Val His Leu Leu Asp Met Ser Val Gln Val Ile Arg Pro Gly Glu
        30                  35                  40

GCC TTT CCC ACA GCT CTG GCA GAT GTA AGG CGG AAT TCC CCA GAG AAG 1574
Ala Phe Pro Thr Ala Leu Ala Asp Val Arg Trp Asn Ser Pro Glu Lys
    45                  50                  55

AAG GGT GGT GAA GAC TCC CGG CTC TCA GCT GCC CCC TGC ATC AGA CCC 1622
Lys Gly Gly Glu Asp Ser Trp Leu Ser Ala Ala Pro Cys Ile Arg Pro
60                  65                  70                  75

AGC AGC TCC CCT CCC ACT GTG GCT CCC GCA TCT GCC TCC CTG CCC CAG 1670
Ser Ser Ser Pro Pro Thr Val Ala Pro Ala Ser Ala Ser Leu Pro Gln
                80                  85                  90

CCC ATC CTC TCT AAC CAA GGA ATC ATG TTC GTT CAG GAG GAG GCC CTG 1718
Pro Ile Leu Ser Asn Gln Gly Ile Met Phe Val Gln Glu Glu Ala Leu
            95                  100                 105

GCC AGC AGC CTC TCG TCC ACT GAC AGT CTG ACT CCC GAG CAC CAG CCC 1766
Ala Ser Ser Leu Ser Ser Thr Asp Ser Leu Thr Pro Glu His Gln Pro
        110                 115                 120

ATT GCC CAG GGA TGT TCT GAT TCC TTG GAG TCC ATC CCT GCG GGA CAG 1814
Ile Ala Gln Gly Cys Ser Asp Ser Leu Glu Ser Ile Pro Ala Gly Gln
    125                 130                 135

GCA GCT TCC GAT GAT TTA AGG GAC GTG CCA GGA GCT GTT GGT GGT GCA 1862
Ala Ala Ser Asp Asp Leu Arg Asp Val Pro Gly Ala Val Gly Gly Ala
140                 145                 150                 155

AGC CCA GAA CAT GCC GAG CCG GAG GTC CAG GTG GTG CCG GGG TCT GGC 1910
Ser Pro Glu His Ala Glu Pro Glu Val Gln Val Val Pro Gly Ser Gly
                160                 165                 170

CAG ATC ATC TTC CTG CCC TTC ACC TGC ATT GGC TAC ACG GCC ACC AAT 1958
Gln Ile Ile Phe Leu Pro Phe Thr Cys Ile Gly Tyr Thr Ala Thr Asn
            175                 180                 185

CAG GAC TTC ATC CAG CGC CTG AGC ACA CTG ATC CGG CAG GCC ATC GAG 2006
Gln Asp Phe Ile Gln Arg Leu Ser Thr Leu Ile Trp Gln Ala Ile Glu
        190                 195                 200

CGG CAG CTG CCT GCC TGG ATC GAG GCT GCC AAC CAG CGG GAG GAG GGC 2054
Trp Gln Leu Pro Ala Trp Ile Glu Ala Ala Asn Gln Trp Glu Glu Gly
    205                 210                 215

CAG GGT GAA CAG GGC GAG GAG GAG GAT GAG GAG GAG GAA GAA GAG GAG 2102
Gln Gly Glu Gln Gly Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Glu
220                 225                 230                 235

GAC GTG GCT GAG AAC CGC TAC TTT GAA ATG GGG CCC CCA GAC GTG GAG 2150
Asp Val Ala Glu Asn Arg Tyr Phe Glu Met Gly Pro Pro Asp Val Glu
                240                 245                 250

GAG GAG GAG GGA GGA GGC CAG GGG GAG GAA GAG GAG GAG GAA GAG GAG 2198
Glu Glu Glu Gly Gly Gly Gln Gly Glu Glu Glu Glu Glu Glu Glu Glu
            255                 260                 265

GAT GAA GAG GCC GAG GAG GAG CGC CTG GCT CTG GAA TGG GCC CTG GGC 2246
Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala Leu Glu Trp Ala Leu Gly
        270                 275                 280

GCG GAC GAG GAC TTC CTG CTG GAG CAC ATC CGC ATC CTC AAG GTG CTG 2294
Ala Asp Glu Asp Phe Leu Leu Glu His Ile Arg Ile Leu Lys Val Leu
    285                 290                 295

TGG TGC TTC CTG ATC CAT GTG CAG GGC AGT ATC CGC CAG TTC GCC GCC 2342
Trp Cys Phe Leu Ile His Val Gln Gly Ser Ile Arg Gln Phe Ala Ala
300                 305                 310                 315

TGC CTT GTG CTC ACC GAC TTC GGC ATC GCA GTC TTC GAG ATC CCG CAC 2390
Cys Leu Val Leu Thr Asp Phe Gly Ile Ala Val Phe Glu Ile Pro His
                320                 325                 330

CAG GAG TCT CGG GGC AGC AGC CAG CAC ATC CTC TCC TCC CTG CGC TTT 2438
Gln Glu Ser Trp Gly Ser Ser Gln His Ile Leu Ser Ser Leu Arg Phe
            335                 340                 345

GTC TTT TGC TTC CCG CAT GGC GAC CTC ACC GAG TTT GGC TTC CTC ATG 2486
Val Phe Cys Phe Pro His Gly Asp Leu Thr Glu Phe Gly Phe Leu Met
        350                 355                 360

CCG GAG CTG TGT CTG GTG CTC AAG GTA CGG CAC AGT GAG AAC ACG CTC 2534
Pro Glu Leu Cys Leu Val Leu Lys Val Arg His Ser Glu Asn Thr Leu
    365                 370                 375

TTC ATT ATC TCG GAC GCC GCC AAC CTG CAC GAG TTC CAC GCG GAC CTG 2582
Phe Ile Ile Ser Asp Ala Ala Asn Leu His Glu Phe His Ala Asp Leu
380                 385                 390                 395

CGC TCA TGC TTT GCA CCC CAG CAC ATG GCC ATG CTG TGT AGC CCC ATC 2630
Arg Ser Cys Phe Ala Pro Gln His Met Ala Met Leu Cys Ser Pro Ile
                400                 405                 410

CTC TAC GGC AGC CAC ACC AGC CTG CAG GAG TTC CTG CGC CAG CTG CTC 2678
Leu Tyr Gly Ser His Thr Ser Leu Gln Glu Phe Leu Arg Gln Leu Leu
            415                 420                 425

ACC TTC TAC AAG GTG GCT GGC GGC TGC CAG GAG CGC AGC CAG GGC TGC 2726
Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln Glu Arg Ser Gln Gly Cys
        430                 435                 440

TTC CCC GTC TAC CTG GTC TAC AGT GAC AAG CGC ATG GTG CAG ACG GCC 2774
Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys Arg Met Val Gln Thr Ala
    445                 450                 455

GCC GGG GAC TAC TCA GGC AAC ATC GAG TGG GCC AGC TGC ACA CTC TGT 2822
Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp Ala Ser Cys Thr Leu Cys
460                 465                 470                 475

TCA GCC GTG CGG CGC TCC TGC TGC GCG CCC TCT GAG GCC GTC AAG TCC 2870
Ser Ala Val Arg Arg Ser Cys Cys Ala Pro Ser Glu Ala Val Lys Ser
                480                 485                 490

GCC GCC ATC CCC TAC TGG CTG TTG CTC ACG CCC CAG CAC CTC AAC GTC 2918
Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr Pro Gln His Leu Asn Val
            495                 500                 505

ATC AAG GCC GAC TTC AAC CCC ATG CCC AAC CGT GGC ACC CAC AAC TGT 2966
Ile Lys Ala Asp Phe Asn Pro Met Pro Asn Arg Gly Thr His Asn Cys
        510                 515                 520

CGC AAC CGC AAC AGC TTC AAG CTC AGC CGT GTG CCG CTC TCC ACC GTG 3014
Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg Val Pro Leu Ser Thr Val
    525                 530                 535

CTG CTG GAC CCC ACA CGC AGC TGT ACC CAG CCT CGG GGC GCC TTT GCT 3062
Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln Pro Arg Gly Ala Phe Ala
540                 545                 550                 555

GAT GGC CAC GTG CTA GAG CTG CTC GTG GGG TAC CGC TTT GTC ACT GCC 3110
Asp Gly His Val Leu Glu Leu Leu Val Gly Tyr Arg Phe Val Thr Ala
                560                 565                 570

ATC TTC GTG CTG CCC CAC GAG AAG TTC CAC TTC CTG CGC GTC TAC AAC 3158
Ile Phe Val Leu Pro His Glu Lys Phe His Phe Leu Arg Val Tyr Asn
            575                 580                 585

CAG CTG CGG GCC TCG CTG CAG GAC CTG AAG ACT GTG GTC ATC GCC AAG 3206
Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys Thr Val Val Ile Ala Lys
        590                 595                 600

ACC CCC GGG ACG GGA GGC AGC CCC CAG GGC TCC TTT GCG GAT GGC CAG 3254
Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly Ser Phe Ala Asp Gly Gln
    605                 610                 615

CCT GCC GAG CGC AGG GCC AGC AAT GAC CAG CGT CCC CAG GAG GTC CCA 3302
Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln Arg Pro Gln Glu Val Pro
620                 625                 630                 635

GCA GAG GCT CTG GCC CCG GCC CCA GTG GAA GTC CCA GCT CCA GCC CCG 3350
Ala Glu Ala Leu Ala Pro Ala Pro Val Glu Val Pro Ala Pro Ala Pro
                640                 645                 650

GAA TTC GAT ATC AAG CTT ATC GAT ACC GTC GAC CTG CAG 3389
Glu Phe Asp Ile Lys Leu Ile Asp Thr Val Asp Leu Gln
            655                 660             664
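
The listing above pairs each nucleotide triplet of the open reading frame with a three-letter amino acid code. As an illustrative aside that is not part of the original listing, the following minimal Python sketch shows how such a reading could be checked mechanically for the first eleven codons (ATG through TCG). It assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, only for the residues used below.
THREE_LETTER = {"M": "Met", "T": "Thr", "G": "Gly", "Q": "Gln",
                "V": "Val", "A": "Ala", "S": "Ser"}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, ignoring spaces and a trailing partial codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First eleven codons of the open reading frame as listed in SEQ ID NO: 1.
orf_start = "ATG ACA GGC CAG GTG GGA GCG CAG ACG GTC TCG"
protein = translate(orf_start)
print(" ".join(THREE_LETTER[aa] for aa in protein))
```

Run as written, the sketch prints `Met Thr Gly Gln Val Gly Ala Gln Thr Val Ser`, which agrees with the first eleven residues given in the listing.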

(2) INFORMATION FOR SEQ ID NO: 2

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1954 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 2
ATGACAGGCC AGGTGGGAGC GCAGACGGTC TCGGGTGGCA AAAGAAGCAT TGCAGGTCTG 60
ACACTTGTGA GGCCGCTCAG AAGTGTGCAC CTGCTTGATA TGTCCGTTCA AGTGATCAGG 120
CCTGGAGAAG CCTTTCCCAC AGCTCTGGCA GATGTAAGGC GGAATTCCCC AGAGAAGAAG 180
GGTGGTGAAG ACTCCCGGCT CTCAGCTGCC CCCTGCATCA GACCCAGCAG CTCCCCTCCC 240
ACTGTGGCTC CCGCATCTGC CTCCCTGCCC CAGCCCATCC TCTCTAACCA AGGAATCATG 300
TTCGTTCAGG AGGAGGCCCT GGCCAGCAGC CTCTCGTCCA CTGACAGTCT GACTCCCGAG 360
CACCAGCCCA TTGCCCAGGG ATGTTCTGAT TCCTTGGAGT CCATCCCTGC GGGACAGGCA 420
GCTTCCGATG ATTTAAGGGA CGTGCCAGGA GCTGTTGGTG GTGCAAGCCC AGAACATGCC 480
GAGCCGGAGG TCCAGGTGGT GCCGGGGTCT GGCCAGATCA TCTTCCTGCC CTTCACCTGC 540
ATTGGCTACA CGGCCACCAA TCAGGACTTC ATCCAGCGCC TGAGCACACT GATCCGGCAG 600
GCCATCGAGC GGCAGCTGCC TGCCTGGATC GAGGCTGCCA ACCAGCGGGA GGAGGGCCAG 660
GGTGAACAGG GCGAGGAGGA GGATGAGGAG GAGGAAGAAG AGGAGGACGT GGCTGAGAAC 720
CGCTACTTTG AAATGGGGCC CCCAGACGTG GAGGAGGAGG AGGGAGGAGG CCAGGGGGAG 780
GAAGAGGAGG AGGAAGAGGA GGATGAAGAG GCCGAGGAGG AGCGCCTGGC TCTGGAATGG 840
GCCCTGGGCG CGGACGAGGA CTTCCTGCTG GAGCACATCC GCATCCTCAA GGTGCTGTGG 900
TGCTTCCTGA TCCATGTGCA GGGCAGTATC CGCCAGTTCG CCGCCTGCCT TGTGCTCACC 960
GACTTCGGCA TCGCAGTCTT CGAGATCCCG CACCAGGAGT CTCGGGGCAG CAGCCAGCAC 1020
ATCCTCTCCT CCCTGCGCTT TGTCTTTTGC TTCCCGCATG GCGACCTCAC CGAGTTTGGC 1080
TTCCTCATGC CGGAGCTGTG TCTGGTGCTC AAGGTACGGC ACAGTGAGAA CACGCTCTTC 1140
ATTATCTCGG ACGCCGCCAA CCTGCACGAG TTCCACGCGG ACCTGCGCTC ATGCTTTGCA 1200
CCCCAGCACA TGGCCATGCT GTGTAGCCCC ATCCTCTACG GCAGCCACAC CAGCCTGCAG 1260
GAGTTCCTGC GCCAGCTGCT CACCTTCTAC AAGGTGGCTG GCGGCTGCCA GGAGCGCAGC 1320
CAGGGCTGCT TCCCCGTCTA CCTGGTCTAC AGTGACAAGC GCATGGTGCA GACGGCCGCC 1380
GGGGACTACT CAGGCAACAT CGAGTGGGCC AGCTGCACAC TCTGTTCAGC CGTGCGGCGC 1440
TCCTGCTGCG CGCCCTCTGA GGCCGTCAAG TCCGCCGCCA TCCCCTACTG GCTGTTGCTC 1500
ACGCCCCAGC ACCTCAACGT CATCAAGGCC GACTTCAACC CCATGCCCAA CCGTGGCACC 1560
CACAACTGTC GCAACCGCAA CAGCTTCAAG CTCAGCCGTG TGCCGCTCTC CACCGTGCTG 1620
CTGGACCCCA CACGCAGCTG TACCCAGCCT CGGGGCGCCT TTGCTGATGG CCACGTGCTA 1680
GAGCTGCTCG TGGGGTACCG CTTTGTCACT GCCATCTTCG TGCTGCCCCA CGAGAAGTTC 1740
CACTTCCTGC GCGTCTACAA CCAGCTGCGG GCCTCGCTGC AGGACCTGAA GACTGTGGTC 1800
ATCGCCAAGA CCCCCGGGAC GGGAGGCAGC CCCCAGGGCT CCTTTGCGGA TGGCCAGCCT 1860
GCCGAGCGCA GGGCCAGCAA TGACCAGCGT CCCCAGGAGG TCCCAGCAGA GGCTCTGGCC 1920
CCGGCCCCAG TGGAAGTCCC AGCTCCAGCC CCGG 1954
(3) INFORMATION FOR SEQ ID NO: 3

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3318 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 3



AATTCCAGTT TAATACTAAC CCTAATGTGT GACTGCGGTT TACAAAGAGC TCTGTATCAC 60
CTGGGATAGC TTTCAGTAGC AATTCACTAC AACTGGTCCT AAAAAATAAT AACAATAATA 120
ATAATAATTA GAGAATTAAA ACCCAACAGC ATGTTGAATG GTTAAAATCA CGTAAGAACT 180
GAAATTTGGG GTGGGGGTGT CCTCAACAGC TGAGCTTGTC CTAGCAGTGA AAATGCTCGC 240
CTCCAAGCAG GGCTCAGAAA GGTCTGGAGC CCTCCAGGCA GAGGGCTGAG CTCAGGGGGC 300
TCTTGGAGGA CACTCACCCC ATGGTCCATG GGATGCTTCT GGCTTCCTTA AAAACAGTTG 360
GGCATCCGCA TTGTATAAGT AGGTGGAGAC CCTAGTGTGG TTCTTTTGAA GGATATGGGA 420
AGGGAGGATG ACGAACTAGA GAAGTGGGAG GGGACCAAAA TCACTGAGGT CCCAGAATAT 480
CATAGATTTG GGTATAGGAT TGGGGTCACT AAGAATTGAG CACCAGGAAT TCCAGCTTCT 540
TCCCATTAAA GAAACTGGGA CTGGTTTTGC CTTGGAGGCC TATGTAGTGT TTTCTGCCCC 600
TGTCCCATAC CAAGTCTCAT TGATATTTCT GCAGAATATC AGATGAAAAT CTATTTCTAA 660
AGACCATTGG GAGAATGGGT GGTGGAGAAG GAGTTGGAGT GGGGTTGGGG GGCAGTTAAA 720
AATGAATAAA AATCTCTCAG CTACAGAACC CAAACATCAC TTCCCTCCGC ATTCACAGCA 780
TTTCCCAGCA GTCCCCAGAT GGTTGTTTCC GTGGGGACAC AGCAGCTGCC TCATTTCCCT 840
TCAGGCCCCA TGGGCTGCTG GTCAACCTCA GGATCTACTA AAGATGACGC AAATGCCGAC 900
TGAACAATCT GAAACCCAAA GGACTCGAGG AGAGACATGT TCTGCTGAGG AGAGAAAGGT 960
GAGCCAAGGG CAGGGCCCAG GTCCCCCAGG GGGCCCCCGA GAGCCCGGAC ATGCACCTTC 1020
TGGATGTGTT TGTTCAAGTA GGACTTAGAG CGGAAGAAGC TCCCACATTC AGGGCATGGG 1080
TACTTCTTCT CCCCATCAGA CTCCATTTTG TTTTTGGGGA CTGCCATGTC GCAGGAGAAA 1140
GAGCCATTGG CACTCTGCTT CTCTGGCGTC TTCAGGTCGC TGGCATCTGA GAGGTCACCA 1200
TAGGAGTCAG AGCTCTCAAT CGGATCCTGA TGTGAGCATT TCTGGCCTTC TCGGTTACAG 1260
ATACTGCAGA AGTTGCTGGG CCCCTCGCTG TGCTTCTTCA GGTGGTCTGC CATGTATGCT 1320
GCCCGCAAGT ACTTCCCACA CACCTGGCAG GGCACCTTGT CTTCATGACA GGCCAGGTGG 1380
GAGCGCAGAC GGTCTCGGGT GGCAAAAGAA GCATTGCAGG TCTGACACTT GTGAGGCCGC 1440
TCAGAAGTGT GCACCTGCTT GATATGTCCG TTCAAGTGAT CAGGCCTGGA GAAGCCTTTC 1500
CCACAGCTCT GGCAGATGTA AGGCGGAATT CCCCAGAGAA GAAGGGTGGT GAAGACTCCC 1560
GGCTCTCAGC TGCCCCCTGC ATCAGACCCA GCAGCTCCCC TCCCACTGTG GCTCCCGCAT 1620
CTGCCTCCCT GCCCCAGCCC ATCCTCTCTA ACCAAGGAAT CATGTTCGTT CAGGAGGAGG 1680
CCCTGGCCAG CAGCCTCTCG TCCACTGACA GTCTGACTCC CGAGCACCAG CCCATTGCCC 1740
AGGGATGTTC TGATTCCTTG GAGTCCATCC CTGCGGGACA GGCAGCTTCC GATGATTTAA 1800
GGGACGTGCC AGGAGCTGTT GGTGGTGCAA GCCCAGAACA TGCCGAGCCG GAGGTCCAGG 1860
TGGTGCCGGG GTCTGGCCAG ATCATCTTCC TGCCCTTCAC CTGCATTGGC TACACGGCCA 1920
CCAATCAGGA CTTCATCCAG CGCCTGAGCA CACTGATCCG GCAGGCCATC GAGCGGCAGC 1980
TGCCTGCCTG GATCGAGGCT GCCAACCAGC GGGAGGAGGG CCAGGGTGAA CAGGGCGAGG 2040
AGGAGGATGA GGAGGAGGAA GAAGAGGAGG ACGTGGCTGA GAACCGCTAC TTTGAAATGG 2100
GGCCCCCAGA CGTGGAGGAG GAGGAGGGAG GAGGCCAGGG GGAGGAAGAG GAGGAGGAAG 2160
AGGAGGATGA AGAGGCCGAG GAGGAGCGCC TGGCTCTGGA ATGGGCCCTG GGCGCGGACG 2220
AGGACTTCCT GCTGGAGCAC ATCCGCATCC TCAAGGTGCT GTGGTGCTTC CTGATCCATG 2280
TGCAGGGCAG TATCCGCCAG TTCGCCGCCT GCCTTGTGCT CACCGACTTC GGCATCGCAG 2340
TCTTCGAGAT CCCGCACCAG GAGTCTCGGG GCAGCAGCCA GCACATCCTC TCCTCCCTGC 2400
GCTTTGTCTT TTGCTTCCCG CATGGCGACC TCACCGAGTT TGGCTTCCTC ATGCCGGAGC 2460
TGTGTCTGGT GCTCAAGGTA CGGCACAGTG AGAACACGCT CTTCATTATC TCGGACGCCG 2520
CCAACCTGCA CGAGTTCCAC GCGGACCTGC GCTCATGCTT TGCACCCCAG CACATGGCCA 2580
TGCTGTGTAG CCCCATCCTC TACGGCAGCC ACACCAGCCT GCAGGAGTTC CTGCGCCAGC 2640
TGCTCACCTT CTACAAGGTG GCTGGCGGCT GCCAGGAGCG CAGCCAGGGC TGCTTCCCCG 2700
TCTACCTGGT CTACAGTGAC AAGCGCATGG TGCAGACGGC CGCCGGGGAC TACTCAGGCA 2760
ACATCGAGTG GGCCAGCTGC ACACTCTGTT CAGCCGTGCG GCGCTCCTGC TGCGCGCCCT 2820
CTGAGGCCGT CAAGTCCGCC GCCATCCCCT ACTGGCTGTT GCTCACGCCC CAGCACCTCA 2880
ACGTCATCAA GGCCGACTTC AACCCCATGC CCAACCGTGG CACCCACAAC TGTCGCAACC 2940
GCAACAGCTT CAAGCTCAGC CGTGTGCCGC TCTCCACCGT GCTGCTGGAC CCCACACGCA 3000
GCTGTACCCA GCCTCGGGGC GCCTTTGCTG ATGGCCACGT GCTAGAGCTG CTCGTGGGGT 3060
ACCGCTTTGT CACTGCCATC TTCGTGCTGC CCCACGAGAA GTTCCACTTC CTGCGCGTCT 3120
ACAACCAGCT GCGGGCCTCG CTGCAGGACC TGAAGACTGT GGTCATCGCC AAGACCCCCG 3180
GGACGGGAGG CAGCCCCCAG GGCTCCTTTG CGGATGGCCA GCCTGCCGAG CGCAGGGCCA 3240
GCAATGACCA GCGTCCCCAG GAGGTCCCAG CAGAGGCTCT GGCCCCGGCC CCAGTGGAAG 3300
TCCCAGCTCC AGCCCCGG 3318

(4) INFORMATION FOR SEQ ID NO: 4
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1171 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 4
GAAGAGGAGG AAGAGGAGGA TGAAGAGGCC GAGGAGGAGC GCCTGGCTCT GGAATGGGCC 60
CTGGGCGCGG ACGAGGACTT CCTGCTGGAG CACATCCGCA TCCTCAAGGT GCTGTGGTGC 120
TTCCTGATCC ATGTGCAGGG CAGTATCCGC CAGTTCGCCG CCTGCCTTGT GCTCACCGAC 180
TTCGGCATCG CAGTCTTCGA GATCCCGCAC CAGGAGTCTC GGGGCAGCAG CCAGCACATC 240
CTCTCCTCCC TGCGCTTTGT CTTTTGCTTC CCGCATGGCG ACCTCACCGA GTTTGGCTTC 300
CTCATGCCGG AGCTGTGTCT GGTGCTCAAG GTACGGCACA GTGAGAACAC GCTCTTCATT 360
ATCTCGGACG CCGCCAACCT GCACGAGTTC CACGCGGACC TGCGCTCATG CTTTGCACCC 420
CAGCACATGG CCATGCTGTG TAGCCCCATC CTCTACGGCA GCCACACCAG CCTGCAGGAG 480
TTCCTGCGCC AGCTGCTCAC CTTCTACAAG GTGGCTGGCG GCTGCCAGGA GCGCAGCCAG 540
GGCTGCTTCC CCGTCTACCT GGTCTACAGT GACAAGCGCA TGGTGCAGAC GGCCGCCGGG 600
GACTACTCAG GCAACATCGA GTGGGCCAGC TGCACACTCT GTTCAGCCGT GCGGCGCTCC 660
TGCTGCGCGC CCTCTGAGGC CGTCAAGTCC GCCGCCATCC CCTACTGGCT GTTGCTCACG 720
CCCCAGCACC TCAACGTCAT CAAGGCCGAC TTCAACCCCA TGCCCAACCG TGGCACCCAC 780
AACTGTCGCA ACCGCAACAG CTTCAAGCTC AGCCGTGTGC CGCTCTCCAC CGTGCTGCTG 840
GACCCCACAC GCAGCTGTAC CCAGCCTCGG GGCGCCTTTG CTGATGGCCA CGTGCTAGAG 900
CTGCTCGTGG GGTACCGCTT TGTCACTGCC ATCTTCGTGC TGCCCCACGA GAAGTTCCAC 960
TTCCTGCGCG TCTACAACCA GCTGCGGGCC TCGCTGCAGG ACCTGAAGAC TGTGGTCATC 1020
GCCAAGACCC CCGGGACGGG AGGCAGCCCC CAGGGCTCCT TTGCGGATGG CCAGCCTGCC 1080
GAGCGCAGGG CCAGCAATGA CCAGCGTCCC CAGGAGGTCC CAGCAGAGGC TCTGGCCCCG 1140
GCCCCAGTGG AAGTCCCAGC TCCAGCCCCG G 1171

(5) INFORMATION FOR SEQ ID NO: 5

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: polypeptide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 5 

Met Thr Gly Gln Val Gly Ala Gln Thr Val Ser
1               5                   10

Gly Gly Lys Arg Ser Ile Ala Gly Leu Thr Leu Val Arg Pro Leu Arg
            15                  20                  25

Ser Val His Leu Leu Asp Met Ser Val Gln Val Ile Arg Pro Gly Glu
        30                  35                  40

Ala Phe Pro Thr Ala Leu Ala Asp Val Arg Trp Asn Ser Pro Glu Lys
    45                  50                  55

Lys Gly Gly Glu Asp Ser Trp Leu Ser Ala Ala Pro Cys Ile Arg Pro
60                  65                  70                  75

Ser Ser Ser Pro Pro Thr Val Ala Pro Ala Ser Ala Ser Leu Pro Gln
                80                  85                  90

Pro Ile Leu Ser Asn Gln Gly Ile Met Phe Val Gln Glu Glu Ala Leu
            95                  100                 105

Ala Ser Ser Leu Ser Ser Thr Asp Ser Leu Thr Pro Glu His Gln Pro
        110                 115                 120

Ile Ala Gln Gly Cys Ser Asp Ser Leu Glu Ser Ile Pro Ala Gly Gln
    125                 130                 135

Ala Ala Ser Asp Asp Leu Arg Asp Val Pro Gly Ala Val Gly Gly Ala
140                 145                 150                 155

Ser Pro Glu His Ala Glu Pro Glu Val Gln Val Val Pro Gly Ser Gly
                160                 165                 170

Gln Ile Ile Phe Leu Pro Phe Thr Cys Ile Gly Tyr Thr Ala Thr Asn
            175                 180                 185

Gln Asp Phe Ile Gln Arg Leu Ser Thr Leu Ile Trp Gln Ala Ile Glu
        190                 195                 200

Trp Gln Leu Pro Ala Trp Ile Glu Ala Ala Asn Gln Trp Glu Glu Gly
    205                 210                 215

Gln Gly Glu Gln Gly Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Glu
220                 225                 230                 235

Asp Val Ala Glu Asn Arg Tyr Phe Glu Met Gly Pro Pro Asp Val Glu
                240                 245                 250

Glu Glu Glu Gly Gly Gly Gln Gly Glu Glu Glu Glu Glu Glu Glu Glu
            255                 260                 265

Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala Leu Glu Trp Ala Leu Gly
        270                 275                 280

Ala Asp Glu Asp Phe Leu Leu Glu His Ile Arg Ile Leu Lys Val Leu
    285                 290                 295

Trp Cys Phe Leu Ile His Val Gln Gly Ser Ile Arg Gln Phe Ala Ala
300                 305                 310                 315

Cys Leu Val Leu Thr Asp Phe Gly Ile Ala Val Phe Glu Ile Pro His
                320                 325                 330

Gln Glu Ser Trp Gly Ser Ser Gln His Ile Leu Ser Ser Leu Arg Phe
            335                 340                 345

Val Phe Cys Phe Pro His Gly Asp Leu Thr Glu Phe Gly Phe Leu Met
        350                 355                 360

Pro Glu Leu Cys Leu Val Leu Lys Val Arg His Ser Glu Asn Thr Leu
    365                 370                 375

Phe Ile Ile Ser Asp Ala Ala Asn Leu His Glu Phe His Ala Asp Leu
380                 385                 390                 395

Arg Ser Cys Phe Ala Pro Gln His Met Ala Met Leu Cys Ser Pro Ile
                400                 405                 410

Leu Tyr Gly Ser His Thr Ser Leu Gln Glu Phe Leu Arg Gln Leu Leu
            415                 420                 425

Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln Glu Arg Ser Gln Gly Cys
        430                 435                 440

Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys Arg Met Val Gln Thr Ala
    445                 450                 455

Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp Ala Ser Cys Thr Leu Cys
460                 465                 470                 475

Ser Ala Val Arg Arg Ser Cys Cys Ala Pro Ser Glu Ala Val Lys Ser
                480                 485                 490

Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr Pro Gln His Leu Asn Val
            495                 500                 505

Ile Lys Ala Asp Phe Asn Pro Met Pro Asn Arg Gly Thr His Asn Cys
        510                 515                 520

Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg Val Pro Leu Ser Thr Val
    525                 530                 535

Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln Pro Arg Gly Ala Phe Ala
540                 545                 550                 555

Asp Gly His Val Leu Glu Leu Leu Val Gly Tyr Arg Phe Val Thr Ala
                560                 565                 570

Ile Phe Val Leu Pro His Glu Lys Phe His Phe Leu Arg Val Tyr Asn
            575                 580                 585

Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys Thr Val Val Ile Ala Lys
        590                 595                 600

Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly Ser Phe Ala Asp Gly Gln
    605                 610                 615

Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln Arg Pro Gln Glu Val Pro
620                 625                 630                 635

Ala Glu Ala Leu Ala Pro Ala Pro Val Glu Val Pro Ala Pro Ala Pro
                640                 645                 650



(6) INFORMATION FOR SEQ ID NO: 6 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 amino acids

(B) TYPE: polypeptide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 6 

Glu Glu Glu Glu Glu Glu
1               5

Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala Leu Glu Trp Ala Leu Gly
            10                  15                  20

Ala Asp Glu Asp Phe Leu Leu Glu His Ile Arg Ile Leu Lys Val Leu
        25                  30                  35

Trp Cys Phe Leu Ile His Val Gln Gly Ser Ile Arg Gln Phe Ala Ala
    40                  45                  50

Cys Leu Val Leu Thr Asp Phe Gly Ile Ala Val Phe Glu Ile Pro His
55                  60                  65                  70

Gln Glu Ser Trp Gly Ser Ser Gln His Ile Leu Ser Ser Leu Arg Phe
                75                  80                  85

Val Phe Cys Phe Pro His Gly Asp Leu Thr Glu Phe Gly Phe Leu Met
            90                  95                  100

Pro Glu Leu Cys Leu Val Leu Lys Val Arg His Ser Glu Asn Thr Leu
        105                 110                 115

Phe Ile Ile Ser Asp Ala Ala Asn Leu His Glu Phe His Ala Asp Leu
    120                 125                 130

Arg Ser Cys Phe Ala Pro Gln His Met Ala Met Leu Cys Ser Pro Ile
135                 140                 145                 150

Leu Tyr Gly Ser His Thr Ser Leu Gln Glu Phe Leu Arg Gln Leu Leu
                155                 160                 165

Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln Glu Arg Ser Gln Gly Cys
            170                 175                 180

Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys Arg Met Val Gln Thr Ala
        185                 190                 195

Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp Ala Ser Cys Thr Leu Cys
    200                 205                 210

Ser Ala Val Arg Arg Ser Cys Cys Ala Pro Ser Glu Ala Val Lys Ser
215                 220                 225                 230

Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr Pro Gln His Leu Asn Val
                235                 240                 245

Ile Lys Ala Asp Phe Asn Pro Met Pro Asn Arg Gly Thr His Asn Cys
            250                 255                 260

Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg Val Pro Leu Ser Thr Val
        265                 270                 275

Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln Pro Arg Gly Ala Phe Ala
    280                 285                 290

Asp Gly His Val Leu Glu Leu Leu Val Gly Tyr Arg Phe Val Thr Ala
295                 300                 305                 310

Ile Phe Val Leu Pro His Glu Lys Phe His Phe Leu Arg Val Tyr Asn
                315                 320                 325

Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys Thr Val Val Ile Ala Lys
            330                 335                 340

Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly Ser Phe Ala Asp Gly Gln
        345                 350                 355

Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln Arg Pro Gln Glu Val Pro
    360                 365                 370

Ala Glu Ala Leu Ala Pro Ala Pro Val Glu Val Pro Ala Pro Ala Pro
375                 380                 385                 390



(7) INFORMATION FOR SEQ ID NO: 7 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 7
CTTGAGGATG CGGATGTGCT 20

(8) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 8 
CCATGGGGTG AGTGTCCT 18 

(9) INFORMATION FOR SEQ ID NO: 9 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 9
AGGACACTCA CCCCATGG 18 
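
As an illustrative aside that is not part of the original listing: the 18-mers of SEQ ID NO: 8 and SEQ ID NO: 9 appear to be reverse complements of one another, that is, the same site read on opposite strands. A minimal Python sketch of that check follows; `reverse_complement` is a hypothetical helper name, and the two strings are copied from the listing above.

```python
# Watson-Crick pairing for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring grouping spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


seq_id_8 = "CCATGGGGTG AGTGTCCT"
seq_id_9 = "AGGACACTCA CCCCATGG"

# True when the two oligonucleotides are reverse complements of each other.
print(reverse_complement(seq_id_8) == seq_id_9.replace(" ", ""))
```

The same helper can be applied to any other pair of oligonucleotides in the listing to see whether they anneal to opposite strands of one site.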

(10) INFORMATION FOR SEQ ID NO: 10 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 10
GTATGGGACA GGGGCAGAAA 20 



(11) INFORMATION FOR SEQ ID NO: 11 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 11
TTTCTAAAGA CCATTGGGAG 20

(12) INFORMATION FOR SEQ ID NO: 12 
(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 12
CCATTTTAAA GTAGCGGTTC 20

(13) INFORMATION FOR SEQ ID NO: 13 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 13 
AGGAGAGAAA GGTGAGCCAA 20

(14) INFORMATION FOR SEQ ID NO: 14 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 14
GTAGATCCTG AGGTTGACCA 20

(15) INFORMATION FOR SEQ ID NO: 15 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 15
TGTGAGCATT TCTGGCCTTC 20

(16) INFORMATION FOR SEQ ID NO: 16 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 16
TGAAGACGCC AGAGAAGCAG 20

(17) INFORMATION FOR SEQ ID NO: 17 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
GCCTCACAAG TGTCAGACCT 20 
(18) INFORMATION FOR SEQ ID NO: 18
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 18
AGAAGGGTGG TGAAGACT 18 

(19) INFORMATION FOR SEQ ID NO: 19 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 19
CTTGGTTAGA GAGGATGGGC 20

(20) INFORMATION FOR SEQ ID NO: 20
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 20
GCCCATCCTC TCTAACCAAG 20

(21) INFORMATION FOR SEQ ID NO: 21 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15202 nucleic acids

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(ix) FEATURE: 

(A) NAME /KEY: 

(B) LOCATION: 

(C) IDENTIFICATION METHOD: 

(D) OTHER INFORMATION: /note="N is unknown or other" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21

GATCCGAGCTCAATTAACCCTCACTAAAGGGAGTCGACTCGATCCTTAAA 
ATATTCATATCTCCTGGACAACCTGTGGCCATAGTGCCTGACTGTAAACC 
CAAAGGGTTTGCCTTTGCCAGTGTAGCCCAGCCTGGTGTCTGCTGCCCCT 
CGCGGTGTCTGTGCACCTGCCACGATGCTGACCAGACACCCTTAACCAGG 
TTCACCCATCGCCTGGGCCTGGAGCAGTCCCCCTGATGCTCTGATTGGTC 
CTTGGACCTTCTGTTCTCCCAAAATCCCAGGTCAGAAAATACCTGGAAGT 
CTATTTGTGTCCCACCTCCCTCTTTGTGGCCGCAAGTGCCCCTTCCTCCA 
CACAGTCACAAGACCATGAGATGCCATCTCCTCCCCTCCTGGGCTGCAGA 
CTTTGGGAAGCTCCCAGGCCACAGAGGTGTCAGCTCCTGTCCAGGCCCTT 
GGGACCTTCCCTCATTCAACCACCCTACCCAACCCCCCACTGCCTGCCAG 
CCACCACTCCCTCCCACATTTGCAGGCGGGGGCCCTGCCCTCTCCTGCCG 
CTGGTTCCCCTACCCAGGAGGCTCTCCCATCGCTCTTTTGAGAGTCTGCC 
TCCCACCTCTAACTGGGGGCTTAGTTCAAGTTGCCCCCTTACCCTAGTCC
CAGCTGCCCAAGAGCTTGCTGCCTCCTGTTCTTGGTGAGGGACTCCAGAG 
ACAGATGTGAGACCTCCCTGGACCCCTCCAAGGCATTCCCAGGTCACTTC 
CATGAGTAGTGAAGAACCGCCTCTGAGCAGGCTGAGCCTCCCTCAGCCTA 
TGGTGTCCTCACGTGGCTTGGCCCACAGCAGGTGCTCACGCCTCCTCCTC 
AGCAGAGCCTACCATCCTCCTGCCATGCTCACCAGTCCCCATGCTGATAG 
CCATCACCAGTCCCCATGCTGATAGCCATCACCAGTCCCCATGCTGATAG 
CCACTTTCTGGATGCTCTAGGTCTGTCTGGATGACACAGTGACCACAGAG 
AAGGAGCTGGACACTGTGGAAGTGCTGAAAGCAATTCAGAAAGCCAAGGA 
GGTCAAGTCCAAACTGAGCAACCCAGAGAAGAAGGTGGGTTTGTGTGGCA 
GGTGGGAGGGCAGTGGTGCAGAGCCAGCCGGGATAGGAGCCAGTTCGGGG 
GGCTTGGGCCATGGGACTGCTCAGGGCTGCCGAGTCCCAGCTGCGCCCCT 
CCCTGGCTGCATGACCTCGGGCAAGTCGCGGCCTCTCTGTTCTCTGTGGG 
GTGGGGACAGTGGTAGTTCCTGCTCTAAGGATATGATGAGACCATCTTTA 
CCACCCAGTTGGTGGGAACCGTTGCGCTCCCTCCTCACACCCCTGGCCTT 
GGGGAGCTCTGTGCTTCCTCTTCTCTCCCGGGCTGACTCAAGCACTCGTC 
CTCAGGGTGGTGAAGACTCCCGGCTCTCAGCTGCCCCCTGCATCAGACCC 
AGCAGCTCCCCTCCCACTGTGGCTCCCGCATCTGCCTCCCTGCCCCAGCC 
CATCCTCTCTAACCAAGGTAATCGTGTATGTATCTTGCTTCTAGTGGAGC 
CACACAGCCCTGCCTGGGCCCCCTGGCTGGGCTGGGGTTGGGGGAGAGGT 
GCCAGCACCTGCTTCCAACAGGGTCAGACACAGGGAGGGCAGTGCCTTCT 
GCAGGCTGGTCCTCGCGGGGGGACACATGGCAGGGGTGCCTGGCCTGATG 
CCAGCTGTTGCTTGCTTGGTGAGGACTCCCAATTGCTCTGATGCCCACAT 
CCAGCTCCTCTAGGAGACCGCAGGGTGTCTGACAGGCCCTGAGGCTGCCC
TCTGAACAGGCTCGGGGCTGTTGGCTCATGGGACCCATTCCCTCACCGGC
AGCACAAGCAGGTTGGCTCCTGGTTACAGGAAGCCGGGCTTGTGACTTTA
CTGTCTGGAGCCCGAATCCCTGTGCAGGGAAAAGCTTGCTTTTATCACTG
CCTCATCTCTpTGGGGTGACCCAGCCCCAGAACACCATGTTTGTGGGGCC 
AAGATGGGCCATCTCTGTCCCTGTGGACCCATGGAAGACCAGGCCCATTC 
GTCTGCCCACTATCTTAGCGTTTTCAAAGGGCTTTCACCTCTGAACCCAG 
GCATCCTCGGAGATGAGTGAGTGAAGCAGGTCTCATGAGCGTGTCTGCTG 
GCCCGGCCCCCACGGAAGAGGGGAGGGTGTpCCGTCCCGAGTGGAGCCGA 
GGCTCGGGACACGCAGGAAAGGACGCCGCCTGCCCGGGCTCCTGGAGACG 
CAGAACTTGGTGTGAGGTCTTGGGAAAACAGTTCAACCCGATGTTTTAAG 
AGCCAGAAAAACATTCCCACCCCTTGACCTGGTAACCCCACTGGTGGGGA 
TTTTCTCTTAGAGGGATAAGATACCGGGAAGGGGAGGTGAAATGCTCACC 
ACTGCCAAAACACGGGCTGCAACTGCAACATCGGAGGATGAGAGGGAGAG 
TCGGCTGTGGTGCAGAATGCTCAGCAGCCCTCCCAGCAGGGACAGGAAGA 
CTGGGCAGGAAGAGGGGAGAAGCATTCAAGTTAAGGCAAAAGGCCCAACG 
CAGAGCAGCACACTGAGGTCACACCTGTGAGATGTGGAAGAGAATTCCTG 
AGCGTGGAGCGATGGGGTTAGGTGCCAGGATGATTGCCCATTTTGCTTCT 
GTCAGACTCTTGACTAAGGATTTCTGGTTGCATTTTATTACATAAAAGCC 
AGGGAGGTTATATCACGGTGAGAAAGCTTCCCTGACGCCGCCTCCTGTAG 
CGCAGCCAAGCGAGCCTGTGGAGGTACCATATGACTGTAGGCCTCTGGGG 
ACAGGGAGCTGCATCTGCTTCTCAAGGCCAGGGACACAGCCATTTCTGCC 
AGCATCTGTTGATCAGTGAGTGAGTGAGTGGGCAGGTAGAGCAGGAGCCA 
GTGAAGAGCAGGCCCTGGATGGGTGGGGATGCACCATGTCCCCAGGCTGC 
AGCTGCAGGCAGCCCCCCACATTGTCGGAGAAGCCTCTGCACCAGCTCAG 
CCCCCTCCTCACTCCCCTTGTGCCCTGGGGACACTCTGCAGAGGGGCACT 
CTGCAGTCTGTCCCCGCCATCGCTGGACTTCTGGACATGGCCTCCAGATT 
TGCACCTCTTAAATAAATCTGCAGTGGATGTCTTTGTGTGCACCTCTCTT 
TCCTTTTGGTGAGAAACAGCAAAGATCGGACCCCTAAGGACTCTCCTGAT 
gtctccgctco?atccgctga ! gtgccctttc|rgaccacttgjrttgtacagg
ccacggtccaLgacgggagcagatagactgtccctgtccqtgtccacatt 
tccttggtccaaacagggcttgtgggaggtagtggcaaaaggtgttggtc
tttttctcactgatttggaggcctccccgtgtgttttttcagccgcgtgt 
tcctgggtcttgcctggatggacagggttttttagcgcgtgggagcagct 
ttgctgaccatgcctgttgcttccagcctgattcccgagaagggagcgtg 
cttgcgaaggaactggcactcgggcctgcctgaagggggcgctgtccaga 
cacacccagcctcccgtcgtggcaggcgctgtcggagccatggatgattg 
tgaccaataggggtggtcgccagagttgattgtccagccaggcccagggg 
ctgagaggaggctgtgtggagaggtggttaggagccagggctcggtcagc 
tgagttcgcatgccagcttcctagctgtgggacctcaagcaacttgtagc 
ccctctgaagctgttttctcaactgtgaagtggacgcaccctacttcatt 
gattctaagaggcacgcatttccaccttgtgacttctctgaaactgaggt 
gcgtctttcagtcagtggcgtctcatagtcgctgtcagccagctggtatt 
cgagatggagtcgtggaaaacccgtggacaccttccgctaggaccaagat 
ggcgccacctgccgcatcttagatttgatgaaatgtggtaaataacgaga 
ggcatgcatgagcgaatgctggggaggcgcttggcactacccagagctcc 
acagaggtggtcgatgagggctgccctttcccacatccttagtagggggt 
tcaagatgacccagactgtgcccctggggagcttggagccatgcgggagg 
atgagccatgtgctggaggagaacagggtaggatggtgtggggcttttgt 
agactgtctagagcagagaaggtctgcagtggaggtggtgtctgaggtga 
atctcgaaggtgaataggagttgaacgttagcaggcagagggtggattgc 
aggagagcagcggcctgggcaggtgcccagcgtggcccatcagggtgctt 
catgcatggctgtgtgcttgccatccttcctgcctgcctaccccctgctg 
cttcgcttcatgggggcgtttgagcttgggcccacctgcctgcctcgctt 
gtgggcagaggacccaggctgtgtgagttgtcctgtcccggggagcagct 
GAGCTTGTCqGGGTTCCTCG^CCTGTGGGG^TTCAGAGGACTTCGGGTCA
TTTCAATGGGCTGTGGCGATGCTGGCTGTGGAGGTAGCCTAGGGCTCCTG 
TAGCCTTCAGTGAGACTGGCGGCCCGATGCCCAGTGTTCACCCTGCTGGC 
GGCAGTCAGGAACATGTTCACAAAGCTTTACTTCAAGTGGTCTAGAGGTG 
ATCTGAGGTGGAGTAACAGGTCCAGATAGGCTACGTTCATAAAACAGCTT 
CAGCGGGGTTTAGGAACACTGTGCATTTACGGGACGCAGTGGGTCAGAGT 
GCTGCTGTCCGTGGGAGGTGGCCCCAGGGCAGGTCAGTGGGCACGTCCTG 
TGGTAAGTGGGACTGTGGATGTGGGCTCAGGCTGGACTCAGCAGCCCTGC 
TGGATACCAAGGCCTGCAAGGGCTGGCCCCCTGGTGAATTGTCCCGTGCC 
CTGTGTATCTATGAGTCCTGCAGAGATGACAAATCAGGGGACGGGGTCAT 
GTCTAGTCACCGTCTGGGAAAATGCTCCAGGAGTGAACACATTTCAGGCT 
CTTGATGGATGTACCTCCAAACTCTTCTCTGGATGGGTGGGCCAGCTTGC 
ATGCCTGTGCCGGCCTCTGCCCAGCGAGGTCAGGGCCAGGCCACACAGTC 
AGTCTGACTTTGGCAGAAGTTGAGAGGCAACACTTGTCTCTTGTTTCAGC 
TTGCCTTTCTTTGTGTACTTCTGAGAGCGAGCATTCTTTTCATGTTCTAT 
CCGCTGGCCGTTCTTCTGCGGAATGTCTGTTCACGTCCTTTGCAGTCTGT 
TAATGAGGTTTCCAACCTTCCCTCATTTTTGTAATCTGTAAGAACTTTTT 
CCAGACTAGCGATATAAATCCTTGTCAAATATTGCAAACACTTTTCTCAT 
TTCATCTGGTTTTAATCTATCCTGGTTTTTAAAAAATGTGTCTGTGGAAG 
TTTAATTTTTATGTAGTCACATCTCAGTTTTTTTCCATTGCATTTATTCT 
CAGAATGCTTCTCCCTGCCCTGAGATTAGATAAGCAGTCATTTGTTCTTT 
CTTGAGTTATTTTGAGATTTCAGTTTTAACATTTTCTTCTATAATCCATG 
TGGCTGGGTTTTGGGATCTGGCTAACCCCCGCCATGCCAGTAGCCTGAGG 
GGCCCAGCCCCACTTGTTGAACAGCCGCTCTCCCCGCCCCACCCACCCTG 
CCTGCCTGCCCACCCGCCCTGGTCTCTCCAGGAATCATGTTCGTTCAGGA 
GGAGGCCCTGGCCAGCAGCCTCTCGTCCACTGACAGTCTGACTCCCGAGC 
ACCAGCCCATTGCCCAGGGATGTTCTGATTCCTTGGAGTCCATCCCTGCG 
GGACAGGTAATGCCCTCTTCCCGCTTCTGGGGACCATACATCTGTGGGTG 
GACTCTTCTGCTTGGGGTTGTGTGCAGTAGGAAGTGGCCTAGCTGGAGCT 
GAGGCAGATGCTTCCAGGGTTTGGCGTCCTCTGCTTTGCGCCACGGTCTT 
TCTCTTGGACCTGTCTCTGGTTGAGTGTCTTCCTGACAAACACAGTGGTT 
AAGGGTTTATTTTCAGCCTCCCTCCTTCCCTTCCCCACCCACCTTGGTTG 
ATGGGAACAGGCAGTTCTCTGTCACTGGGCCCAGGGCACGAGGGGGGCAG 
GTGGAGAGGGTGGCCCTTGACCCTGTGAGCAGGCTTCCCTGGGGAAGGCA 
TTTCAAAAGACCCTCGTGCAGGGGCTTGTTTGGGTTTCTTCTCTGTTTCC 
TGGCACCCCTGGAGCCACTCGGCGCCTTTCCGCATGTCACCCTGGTGGTC 
TGGGAAACAGTCTCACTCTGGCGCCTCCTCTGTGGTTGTTACTGAGAGTT 
CTGGGGCCCCTTCCTTTGTCCTGAGGAAAGACAGGAGGAAAGCAAGGGTG 
CTTGCTGTGTGCTTCGCAAATGTGCTTGGTGCCTGGGCCTCCCTCCAGCC 
CCATCTCTGCAGCAGCACAAGGTTATGGCCTTGTGACACTGGGACAGTTT 
GCAGAGTCCTTGTCTGTCCTCAGTACTCCACAGTATTCTGCCATCACCCT 
TTCCAGGGTCACACAGCAAGAGATTCCCAAGCCCTAGGTATTCCCCAGTG 
CACAGAGACCATTGGGAGGGACTTGCCAGGGCTGTGTCCACTGCTGGCCA 
GTTAGGGTCGGACCAAATTTGTAGACTGTCTACCTGGACCCTTGCGTGGC 
ACAAGGAGCAGTCAGATGCTGGATCCCTGGAGAGTGGCGAGAGGCTCTGG 
CCTTAGGTTGCGAGTGGGAATCCCAGCCCTGCTGTGTGCTGGTGGGATAA 
CCAAGTGGGTCTCTGCCCTTGGGTCCCAGAGTGGGCCCCAGGGTCCCAGA 
GTGGGCTCCAGGGTACAGCGTGGGGATGGGGAGCCTCCTCAGGGCGGTGA 
TGGAGGGCAGAATGCCCAGCTCAGGGTCTGGCAACCAGTAAATGGCTGGG 
GCTGGCTGCAGTAGGTGGGGACTGACTGTGTTTCTTTCTCCATCAGGCAG 
CTTCCGATGATTTAAGGGACGTGCCAGGAGCTGTTGGTGGTGCAAGGTAA 
GGAAGAGGTTGGAAAGGGACCTGGGCCTGGCCACACAGCCTTATGCACAC 
ACACTGCTGTjSGGCCAGGGGTGGCCAGTCAGGTTTTTTTA U^AATCCGTT 
CACAGAAGGCbTATAGAACTATTTCTTCCTCTAAAGAGAC \CAGATGAGA 
TGGACTTTTCAATCTGTTTCCAAATTCTAATACCTAAACT 






'pTGCTCAGCA 
TGTGGCCCA 



CATGTTGCCCTACACCAGGGGTTGGCAAATCAAGGCCTGTG 
CAGCCTGGGAGCTAAGAATGACAGTTACATTCTTTTTTCTrTTTTTGAGA 
CTGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGTGGCG TGTTCTTGGC 
TCACTGCAACCCCCGCCTCCCAGATTAATGCAATTTTCCTGTCTCAGCCT 
CAGCCTTCTGAGTAGCCCGGACCACAGGCGCACGCCACCACGCCCAACTA 
ATTTTTTATATTTTTAGTAGAGACAGAGATTCACCATGTdGCCTAGCTGG 
TCTCGAACTCCTGAACTCCAGTGATCCACCAACCTCGGCTTCCTAAAGTA 
CTGGAATTACAGGCATGAGCCACCGCGCCTGGCTAGAATi ACAGTTACTT 
TTTTTTTCTTTGAGACTGAGTCTTGCTTTGTCACCCAGGCTGGAGTGCAG 
TGGCACGATCTCAGCTCGCTGCAACCTCCGCCTCCCGGGTTCAAGCGATT 
CTTCTGCCTCAGCCACCCAAGGTGCCCGCCACCACACCTGGCTAATTTTT 
CTGTTTTTAGTAGGGACAGGATTTCGCCATGTTGGACAGTrTACATTCTTA 
AAGGGCTGCTGAAGATCGTATGGACATGGTAGCCCATAAATCCCAAAATG 
TGTACTCTGACCCTTTACAGAAGCTTACTAACTCCCACTCTACATGTGAG 
GGCTGCGGTGGCCAAGAAGAGCTGGAATTTAAGTGTGAAGGTCCTAAGAC 
CTGCCCCAGCCCACTTCCCTGCCCCGGAGGCCACCAGGGGTGACAAGTAG 
ATTCATGCCCTGGAGTGTTCCTTCTCTCCGGGGCTTATGGCAGCAACTGA 
ATGACTTAGAAGTCCATGGGAGTGCTTTCTGTTGTGGGAACTCGTGTGGT 
CTGGGCATAGCTGTGCCAGGCACCTATGGTCCAAGCCCCTAGAAGCATAG 
ACTCTGACCAAACTGGCGACCCAGCCTTCCAGCAGGCAGCACTGGCTCCC 
ACCAGGGCCCTCATCCTGGGAACTGACTTGGCCATGTGGGAGGCTTGGGA 
GACCCATGGGTTGGTTTCTCAGGGTCAGGGTGTAGCAGTGGGCTCCAGAT 
GTGGCAGGTGGGAGGTGGGAGGGGCCCCTCCCAGCATGCCACTGACCTGG 
CCTCTCCCTGCACAGCCCAGAACATGCCGAGCCGGAGGTCCAGGTGGTGC 
CGGGGTCTGGCCAGATCATCTTCCTGCCCTTCACCTGCATTGGCTACACG 
GCCACCAATCAGGACTTCATCCAGCGCCTGAGCACACTGATCCGGCAGGC 
CATCGAGCGGCAGCTGCCTGCCTGGATCGAGGCTGCCAACCAGCGGGAGG 
AGGGCCAGGGTGAACAGGGCGAGGAGGAGGATGAGGAGGAGGAAGAAGAG 
GAGGACGTGGCTGAGAACCGCTACTTTGAAATGGGGCCCCCAGACGTGGA 
GGAGGAGGAGGGAGGAGGCCAGGGGGAGGAAGAGGAGGAGGAAGAGGAGG 
ATGAAGAGGCCGAGGAGGAGCGCCTGGCTCTGGAATGGGCCCTGGGCGCG 
GACGAGGACTTCCTGCTGGAGCACATCCGCATCCTCAAGGTGCTGTGGTG 
CTTCCTGATCCATGTGCAGGGCAGTATCCGCCAGTTCGCCGCCTGCCTTG 
TGCTCACCGACTTCGGCATCGCAGTCTTCGAGATCCCGCACCAGGAGTCT 
CGGGGCAGCAGCCAGCACATCCTCTCCTCCCTGCGCTTTGTCTTTTGCTT 
CCCGCATGGCGACCTCACCGAGTTTGGCTTCCTCATGCCGGAGCTGTGTC 
TGGTGCTCAAGGTACGGCACAGTGAGAACACGCTCTTCATTATCTCGGAC 
GCCGCCAACCTGCACGAGTTCCACGCGGACCTGCGCTCATGCTTTGCACC 
CCAGCACATGGCCATGCTGTGTAGCCCCATCCTCTACGGCAGCCACACCA 
GCCTGCAGGAGTTCCTGCGCCAGCTGCTCACCTTCTACAAGGTGGCTGGC 
GGCTGCCAGGAGCGCAGCCAGGGCTGCTTCCCCGTCTACCTGGTCTACAG 
TGACAAGCGCATGGTGCAGACGGCCGCCGGGGACTACTCAGGCAACATCG 
AGTGGGCCAGCTGCACACTCTGTTCAGCCGTGCGGCGCTCCTGCTGCGCG 
CCCTCTGAGGCCGTCAAGTCCGCCGCCATCCCCTACTGGCTGTTGCTCAC 
GCCCCAGCACCTCAACGTCATCAAGGCCGACTTCAACCCCATGCCCAACC 
GTGGCACCCACAACTGTCGCAACCGCAACAGCTTCAAGCTCAGCCGTGTG 
CCGCTCTCCACCGTGCTGCTGGACCCCACACGCAGCTGTACCCAGCCTCG 
GGGCGCCTTTGCTGATGGCGACGTGCTAGAGCTGCTCGTGGGGTACCGCT 
TTGTCACTGCCATCTTCGTGCTGCCCCACGAGAAGTTCCACTTCCTGCGC 
GTCTACAACCAGCTGCGGGCCTCGCTGCAGGACCTGAAGACTGTGGTCAT 
CGCCAAGACCCCCGGGACGGGAGGCAGCCCCCAGGGCTCCTTTGCGGATG 
GCCAGCCTGCCGAGCGCAGGGCCAGGTGAGATCAAGCACAGCTCTCAGGG 
gccccgggggcacgggtctggcatgtgtgtgatctcagcatctgcggcta 
gtgtgggctgggagttgctgcgagagctgggccccctcccccctgcccct 
cgccccccccgggcctccctctacatcaccaccccaggtttggtgccagg 
ctgctccttatctcagtgctgtagaagaagcccaggaaagctgtcctctc 
acaaaatgggttggcccagcctcttgccacccatgaagggcaggccaagg 
gggctgccccacctttgcctgcccagtgggagagcaacaggctgcagcac 
accgaggccaggagagctgtcaccctggctgctgtgctcctctgggccca 
agcatggcctctgggcactacctcctccagggtcacagtcccacggatgg 
ctctgtgggccaggatctgccttaggcttcacccacctcaacatcttgct 
gtgttgttcaggctggtctcaaactttgggctcaaacaatcctccgcctc 
agcctcccaaagtgctgggattacagacatgagccaccgtgcccggccgt 

GCTGTTCTGTTCTCCAATAGAGAAGCTGGTGGAAGTCCCCAGTAACCCAG 

aggtgatgtgtgatgcacacagtctcctcactctgaagctgcacatgcga 
tgtgaatcttcatttggggtccgctgttaatatggtgtttttcgggggat 
acagcaatgaccagcgtccccaggaggtcccagcagaggctctggccccg 
gccccagtggaagtcccagctccagcccctgcagcagcctcagcctcagg 
cccagcgaagactccggccccagcagaggcctcaacttcagctttggtcc 
cagaggagacgccagtggaagctccagccccacccccagccgaggcccct 
gcccagtacccgagtgagcacctcatccaggccacctcggaggagaatca 
gatcccctcgcacttgcctgcctgcccgtcgctccggcacgtcgccagcc 
tgcggggcagcgccatcatcgagctcttccacagcagcattgctgaggta 
gcggcccgggtgtgggtgccagctatggcacggccagtcctgagggcgag 
gccaagcttggcttcaggtcagcctcaggtccctggacttccctgatgtc 
ggagtcctcagctgagctgctcacagctttgaggacctgggcagtgaggt 
cctgagttgccctccctggccatttgtgctgtgtcaccacctcctgtgcc 
acttccagccccaggtagacctcccaccaacagccatctcccacccctct 
cttcctctctgccttgaagcatacggattcattggtgagccaagaggggc 
ttcccatgtctccttgtggaagctgtgggcatgtccctggtatgtgcagg 
ttgctagggtggtggagctgacaggaggccccccgtcttcaggttgaaaa 
cgaggagctgaggcacctcatgtggtcctcggtggtgttctaccagaccc 
cagggctggaggtgactgcctgcgtgctgctctccaccaaggctgtgtac 
tttgtgctccacgacggcctccgccgctacttctcagagccactgcaggg 
taggcacagggcctgctggggctcaggagcttggagtgtgtggttggggc 
aggcctggggggtcattctctggagccagctgtgtggcttcaggcagcag 
tcagcgacttggctgcagtgggctgagagttccttgtctgaggaagggag 
ctgtcatgagggaggggtccatggccagatgtgaacgcagaatgcactga 

GCCAGGGCCTGGTGACTGCTTGGGAACAGCCTGTGATGAGAAGGGGTTAG 
GCAGCCTTTGCCCCTGGGGCTGCACAGGAAGCCCTAGCCAGCGACCTGGT 
GACTCCCCTGAGCTGGAAGAGGCTCAGACTCCAGAGGGCATTGCCTATGG 
GGCTTTGCACGGGTGGAAGCCAGGCCAGCCAAGAGGACCTGTTCCTGCTG 
GATGTGCTGCACACCTAGGAACCTTGTGCTTGCCTGCCACCGCCTCCCTC 
TGTCCCTTTCTCCATCACACAGATTTCTGGCATCAGAAAAACACCGACTA 
CAACAACAGCCCTTTCCACATCTCCCAGTGCTTCGTGCTAAAGCTTAGTG 
ACCTGCAGTCAGTCAATGTGGGGCTTTTCGACCAGCATTTCCGGCTGACG 
GGTGGGTGACCCTCTGTGCTTTGTCCTATTTCGGGTGAAGGCCAGCATCA 
CCAGTGGGCTTCCACCTTCCGTACGTGGGTGGGTTATCATAGACAGTTAT 
CTCTGTGCTCAAGAGCCACTTCTTACCCGGGGTGGGAGGAAGCAGCTTCA 
GGAACTGCTGAGAGAGCAGAACTCACGCTCCAGGGCTCAGAGCAGGAGGT 
AGGGTGTGCGGCAAGCGCTGGCCCGGACAGAAGCAGAGTGpGCCCTGGTC 
TCGGGCAGGATGTTTCTGACTCACATTTCCjrGAGGAGAGAAAGCTAAGCT 


CTTTGCCTAATGTCTCTGTCTCCCCTTCCAGAAAAATGCCTCAGCTCTTC 
CGGCCTGAAGGAATGGCCTCCTCCCGGGCCCCATGATTCTTTCCTGTGTG 
GGCCCTCCTGGCCCTGGCCTCTGGGCTGAGGCTTGCTAGGGACTCGGGGT 
GGCTCTAAGGGGCAGGGATAGGGCTGGGGAGCGCCGGCCTGTGGCCCTGA 
CCAGCCCCTTCTCGTGCAGGTTCCACCCCGATGCAGGTGGTCACGTGCTT 
GACGCGGGACAGCTACCTGACGCACTGCTTCCTCCAGCACCTCATGGTCG 
TGCTGTCCTCTCTGGAACGCACGCCCTCGCCGGAGCCTGTTGACAAGGAC 
TTCTACTCCGAGTTTGGGAACAAGACCACAGGTACCCCTGTCTAGCTCAG 
GCTGCAGACAGGCTGCCTGGACAGACGTCATGGGCCCCAGGGTGGCTCTC 
TGTGCCCCAGAACCCTCTCTGCCTCTATGTCTCTCTTTTCTCACTTAGCT 
GGCCAGGGTTTTATGTGGGGCTTTTCGATGGCAGAGTCTCCACTCCAGCA 
GTCCCTCAACCATCTGGCAGACACATCTCCAGTGCCTGCTTTGGGCTCCT 
GGCCTGTGGGCCCCACACTTGGAGCATCCTCTCCTGCCTGTCTCATGCCG 
GGGTCTCTCGGTTGGCTTGGGGCCCTTGGTGCTCCCAGCCCCACCAGGGG 
CCGGTTCCAGGCTATAGCCCAGGTGGCATCTCTCTGCAGGGAAGATGGAG 
AACTACGAGCTGATCCACTCTAGTCGCGTCAAGTTTACCTACCCCAGTGA 
GGAGGAGATTGGGGACCTGACGTTCACTGTGGCCCAAAAGATGGCTGAGC 
CAGAGAAGGCCCCAGCCCTCAGCATCCTGCTGTACGTGCAGGCCTTCCAG 
GTGGGCATGCCACCCCCTGGGTGCTGCAGGGGCCCCCTGCGCCCCAAGAC 
ACTCCTGCTCACCAGCTCCGAGATCTTCCTpCTGGATGAGGACTGTGTCC 
ACTACCCACTGCCCGAGTTTGCCAAAGAGC'CGCCGCAGAGAGACAGGTAC 
CGGCTGGACGATGGCCGCCGCGTCCGGGAcbTGGACCGAGTGCTCATGGG 
CTACCAGACCTACCCGCAGGCCCTCACCCTCGTCTTCGATGACGTGCAAG 


GTCATGACCTCATGGGCAGTGTCACCCTGGACCACTTTGGGGAGGTGCCA 
GGTGGCCCGGCTAGAGCCAGCCAGGGCCGTGAAGTCCAGTGGCAGGTGTT 




TGTCCCCAGTGCTGAGAGCAGAGAGAAGCTbATCTCGCTGTTGGCTCGCC 
AGTGGGAGGCCCTGTGTGGCCGTGAGCTGCCTGTCGAGCTCACCGGCTAG 
CCCAGGCCACAGCCAGCCTGTCGTGTCCAGbcTGACGCCTACTGGGGCAG 
GGCAGCAGGCTTTTGTGTTCTCTAAAAATGTTTTATCCTCCCTTTGGTAC 
CTTAATTTGACTGTCCTCGCAGAGAATGTGAACATGTGTGTGTGTTGTGT 
TAATTCTTTCTCATGTTGGGAGTGAGAATGCCGGGCCCCTCAGGGCTGTC 
GGTGTGCTGTCAGCCTCCCACAGGTGGTACAGCCGTGCACACCAGTGTCG 
TGTCTGCTGTTGTGGGACCGTTGTTAACACGTGACACTGTGGGTCTGACT 
TTCTCTTCTACACGTCCTTTCCTGAAGTGTCGAGTCCAGTCCTTTGTTGC 
TGTTGCTGTTGCTGTTGCTGTTGCTGTTGGCATCTTGCTGCTAATCCTGA 
GGCTGGTAGCAGAATGCACATTGGAAGCTCCCACCCCATATTGTTCTTCA 
AAGTGGAGGTCTCCCCTGATCCAGACAAGTGGGAGAGCCCGTGGGGGCAG 
GGGACCTGGAGCTGCCAGCACCAAGCGTGATTCCTGCTGCCTGTATTCTC 
TATTCCAATAAAGCAGAGTTTGACACCGTCTGCATCTTCTAAACCAAGGG 
TCACTGGGATCGAGTCGACGGCCCTATAGTGAGTCGTATTAGAGCTCGCG 
GCCGCGAGCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATC 
TGCAGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCCAC 
AGAGGTGGTCGATGAGGGCTGCCCTTTCCCACATCCTTAGTAGGGGGTTC 
AAGATGACCCAGACTGTGCCCCTGGGGAGCTTGGAGCCATGCGGGAGGAT 
GAGCCATGTGCTGGAGGAGAACAGGGTAGGATGGTGTGGGGCTTTTGTAG 
ACTGTCTAGAAGCAAAGAAGGTCTGCAGTGGAGGTGGTGTCTGAGGTGAA 
TCTCGAAGGTGAATAGGAGTTGAACGTTAGCAGGCAGAGGGTGGATTGCA 
GGAGAGCAGCGGCCTGGGCAGGTGCCCAGCGTGGCCCATCAGGGTGCTTC 
ATGCATGGCTGTGTGCTTGCCATCCTTCCTGCCTGCCTACCCCCTGCTGC 
TTCGCTTCATGGGGGCGTTTGAGCTTGGGCCCACCTGCCTGCCTCGCTTG 
TGGGCAGAGGACCCAAGCTGTGTGAGTTGTCCTGTCCCGGGGAGCAGCTG 
AACTGGTCCGGGGTCTCGAApTGTGGGGCTCAAAAGGACTCCGGGGTCAT 
TTCACTGGGGCTGTGCCGATTCCTGGGGGCTGTTNGGAAN3TAAAGGCCT 
AAAGGGGCTQCCTGGTTANGGCCCTCAANTTTAANAACCT3GGGCCGGGG 
CCCGGAATTGCCCCCAANTTTGTTTCAACNCCCCTTGGCC TTNGGCNGGG 
GCAAATTTCCANGGGGAACCAATGGNTTTCCCCCAAAAANGGGGCCNTTT 

taacccnttJccaaantttgggncctaaaa|aagggtggan ITCCTGAANG 



(22) INFORMATION FOR SEQ ID NO: 22 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1070 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
vclddtvttekeldtvevlkaiqkakevksklsnpekkggedsrlsaapcirpssspptvapasa 
slpqpilsnqgimfvqeealasslsstdsltpehqpiaqgcsdslesipagqaasddlrdvpgav 
ggaspehaepevqvvpgsgqiiflpftcigytatnqdfiqrlstlirqaierqlpawieaanqre 

EGQGEQGEEEDEEEEEEEDVAENRYFEMGPPDVEEEEGGGQGEEEEEEEEDEEAEEERLALEWAL 
GADEDFLLEHIRILKVLWCFLIHVQGSIRQFAACLVLTDFGIAVFEIPHQESRGSSQHILSSLRF 
VFCFPHGDLTEFGFLMPELCLVLKVRHSENTLFIISDAANLHEFHADLRSCFAPQHMAMLCSPIL 
YGSHTSLQEFLRQLLTFYKVAGGCQERSQGCFPVYLVYSDKRMVQTAAGDYSGNIEWASCTLCSA 
VRRSCCAPSEAVKSAAIPYWLLLTPQHLNVIKADFNPMPNRGTHNCRNRNSFKLSRVPLSTVLLD 
PTRSCTQPRGAFADGHVLELLVGYRFVTAIFVLPHEKFHFLRVYNQLRASLQDLKTVVIAKTPGT 
GGSPQGSFADGQPAERRASNDQRPQEVPAEALAPAPVEVPAPAPAAASASGPAKTPAPAEASTSA 
LVPEETPVEAPAPPPAEAPAQYPSEHLIQATSEENQIPSHLPACPSLRHVASLRGSAIIELFHSS 
IAEVENEELRHLMWSSVVFYQTPGLEVTACVLLSTKAVYFVLHDGLKRYFSEPLQDFWHQKNTDY 
NNSPFHISQCFVLKLSDLQSVNVGLFDQHFRLTGSTPMQVVTCLTRDSYLTHCFLQHLMVVLSSL 
ERTPSPEPVDKDFYSEFGNKTTGKMENYELIHSSRVKFTYPSEEEIGDLTFTVAQKMAEPEKAPA 
LSILLYVQAFQVGMPPPGCCRGPLRPKTLLLTSSEIFLLDEDCVHYPLPEFAKEPPQRDRYRLDD 
GRRVRDLDRVLMGYQTYPQALTLVFDDVQGHDLMGSVTLDHFGEVPGGPARASQGREVQWQVFVP 
SAESREKLISLLARQWEALCGRELPVELTG 

WHAT IS CLAIMED IS: 

CLAIMS 

1. A DNA molecule encoding for a polypeptide including an amino 
acid sequence which is receptive to imidazoline compounds, said 
DNA molecule containing a DNA sequence with at least 75% sequence 
similarity with the DNA sequence shown in SEQ ID No. 4. 

2. A DNA molecule according to claim 1, containing a DNA 
sequence with at least 75% sequence similarity with the DNA 
sequence shown in SEQ ID No. 2. 

3. A DNA molecule according to claim 2, containing a DNA 
sequence with at least 75% sequence similarity with the DNA 
sequence of SEQ ID No. 3. 

4. A DNA molecule according to claim 3, containing a DNA 
sequence with at least 75% sequence similarity with the DNA 
sequence of SEQ ID No. 1. 

5. A DNA molecule according to any one of claims 1 to 4, 
containing a DNA sequence with at least 80% sequence similarity 
with the sequence of said SEQ ID No. 

6. A DNA molecule according to any one of claims 1 to 4, 
containing a DNA sequence with at least 85% sequence similarity 
with the sequence of said SEQ ID No. 







7. A DNA molecule according to any one of claims 1 to 4, 
containing a DNA sequence with at least 90% sequence similarity 
with the sequence of said SEQ ID No. 

8. A DNA molecule according to any one of claims 1 to 4, 
containing a DNA sequence with at least 95% sequence similarity 
with the sequence of said SEQ ID No. 

9. A DNA molecule according to claim 1, which is deposited with 
the ATCC under deposit accession no. ATCC 209217. 

10. A genomic DNA molecule encoding for a polypeptide including 
an amino acid sequence which is receptive to imidazoline 
compounds, and wherein exon portions of said genomic DNA molecule 
include the DNA sequence as defined in claim 1. 

11. A genomic DNA molecule according to claim 10, which is 
deposited with the ATCC under deposit accession no. ATCC 209216. 

12. A 1110 bp ApaI-EcoRI restriction fragment of the DNA 
molecule according to claim 1. 

13. A 1.85 kb EcoRI restriction fragment of the DNA molecule 
according to claim 4. 

14. A vector containing a DNA sequence as defined in any one of 
claims 1-13. 

15. A host cell transfected with a vector as defined in claim 
14. 

16. An isolated polypeptide including a site which is receptive 
to imidazoline compounds, said polypeptide containing an amino 
acid sequence with at least 80% sequence similarity with the 
amino acid sequence shown in SEQ ID No. 6. 

17. A polypeptide as defined in claim 16, having a molecular 
weight of about 35 to 45 kDa. 

18. A polypeptide as defined in claim 17, having a molecular 
weight of about 37 kDa. 

19. An isolated polypeptide including a site which is receptive 
to imidazoline compounds, said polypeptide containing an amino 
acid sequence with at least 80% sequence similarity with the 
amino acid sequence shown in SEQ ID No. 5. 

20. A polypeptide as defined in claim 19, having a molecular 
weight of about 60 to 85 kDa. 

21. A polypeptide as defined in claim 20, having a molecular 
weight of about 70 kDa. 

22. A fragment of the amino acid sequence shown in SEQ ID No. 5 
or 6, which fragment is receptive to imidazoline compounds. 

23. A polypeptide according to any one of claims 16 to 22, which 
is immunoreactive with at least one of Reis antiserum and 
Dontenwill antiserum. 

24. A polypeptide according to any one of claims 16 to 23, which 
is a human polypeptide. 

25. A method of producing an isolated polypeptide including an 
amino acid sequence which is receptive to imidazoline compounds, 
said method comprising: 

transfecting a host cell with a vector as defined in claim 
14; and 

culturing the transfected host cell in a culture medium to 
express the polypeptide. 

26. An isolated polypeptide including an amino acid sequence 
which is receptive to imidazoline compounds, which polypeptide is 
expressed by the method of claim 25. 

27. A method of screening for a ligand of an imidazoline 
receptor, which method comprises: 

culturing a host cell as defined in claim 15 in a culture 
medium to express a polypeptide including an amino acid sequence 
which is receptive to imidazoline compounds; 

contacting said polypeptide with a labelled ligand for the 
imidazoline receptor under conditions effective to bind the 
labelled ligand thereto; 

contacting said polypeptide with a candidate ligand; and 

detecting any displacement of the labelled ligand from said 
polypeptide, wherein displacement signifies that the candidate 
ligand is a ligand for the imidazoline receptor. 

28. The method of claim 27, wherein said contacting steps are 
performed in an intact cultured host cell. 

29. The method of claim 27, further comprising isolating the 
cell membrane of said cultured host cell prior to performing said 
contacting steps. 

30. The method of claim 27, wherein said contacting of said 
imidazoline receptive polypeptide with said candidate ligand is 
conducted at a plurality of candidate ligand concentrations. 

31. The method of claim 27, wherein the labelled ligand is 
radiolabelled. 

32. A method of obtaining a DNA material encoding a polypeptide 
which is receptive to imidazoline compounds, said method 
comprising: 

providing a labelled DNA probe by labelling a DNA molecule 
identical or complementary to a DNA molecule as defined in any 
one of claims 1 to 9 or a restriction fragment thereof; 

contacting said DNA probe with genetic material suspected of 
encoding said imidazoline receptive polypeptide; 

hybridizing said DNA probe and said genetic material under 
stringent hybridization conditions; 

identifying any portion of the genetic material which 
hybridizes to said DNA probe; and 

isolating said identified material. 

33. A method according to claim 32, wherein the genetic material 
is derived from a library selected from the group consisting of 
RNA library, cDNA library and genomic DNA library. 

34. A method according to claim 33, wherein said library is a 
human library. 

35. A method according to claim 32, wherein the labelled DNA 
probe is provided by labelling a restriction fragment according 
to claim 12 or 13. 

36. A method of raising antibodies immunoreactive with a 
polypeptide which is receptive to an imidazoline compound, which 
method comprises: 

injecting an animal with a polypeptide as defined in any one 
of claims 16 to 24 and 26; and 

isolating antibodies produced by the animal. 



ABSTRACT 



A genomic DNA encoding a human imidazoline receptor is 
described. cDNAs encoding the receptor and fragments thereof are 
also provided. An amino acid sequence predicted to be 120,000 MW 
for nearly the entire protein is identified, as well as a middle 
fragment believed to contain the imidazoline binding site of the 
receptor. The protein is highly unique in its sequence and may 
represent the first in a novel family of receptor proteins. 
Methods of cloning the cDNA and expressing the imidazoline 
receptor in a host cell are described. Methods of preparing 
antibodies against the transfected protein are also described. 
Also, a screening method for identifying additional subtypes of 
this receptor is identified. Also, screening methods for 
identifying drugs that interact with the imidazoline receptor are 
described. 

SEQUENCE LISTING 



<110> PILETZ, John E. 

IVANOV, Tina R. 

<120> DNA MOLECULES ENCODING IMIDAZOLINE RECEPTIVE POLYPEPTIDES 
AND POLYPEPTIDES ENCODED THEREBY 



<130> Corrected Sequence Listing 



<140> 08/922,635 
<141> 1997-09-03 



<150> 08/650,766 
<151> 1996-05-20 



<150> 60/012,600 
<151> 1996-03-01 



<160> 22 

<170> PatentIn Ver. 2.0 



<210> 1 

<211> 3385 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1398)..(3383) 



<400> 1 



gctctagaac tagtggatcc cccgggctgc aggaattcca gtttaatact aaccctaatg 60 
tgtgactgcg gtttacaaag agctctgtat cacctgggat agctttcagt agcaattcac 120 
tacaactggt cctaaaaaat aataacaata ataataataa ttagagaatt aaaacccaac 180 
agcatgttga atggttaaaa tcacgtaaga actgaaattt ggggtggggg tgtcctcaac 240 
agctgagctt gtcctagcag tgaaaatgct cgcctccaag cagggctcag aaaggtctgg 300 
agccctccag gcagagggct gagctcaggg ggctcttgga ggacactcac cccatggtcc 360 
atgggatgct tctggcttcc ttaaaaacag ttgggcatcc gcattgtata agtaggtgga 420 
gaccctagtg tggttctttt gaaggatatg ggaagggagg atgacgaact agagaagtgg 480 
gaggggacca aaatcactga ggtcccagaa tatcatagat ttgggtatag gattggggtc 540 
actaagaatt gagcaccagg aattccagct tcttcccatt aaagaaactg ggactggttt 600 
tgccttggag gcctatgtag tgttttctgc ccctgtccca taccaagtct cattgatatt 660 
tctgcagaat atcagatgaa aatctatttc taaagaccat tgggagaatg ggtggtggag 720 
aaggagttgg agtggggttg gggggcagtt aaaaatgaat aaaaatctct cagctacaga 780 
acccaaacat cacttccctc cgcattcaca gcatttccca gcagtcccca gatggttgtt 840 
tccgtgggga cacagcagct gcctcatttc ccttcaggcc ccatgggctg ctggtcaacc 900 
tcaggatcta ctaaagatga cgcaaatgcc gactgaacaa tctgaaaccc aaaggactcg 960 
aggagagaca tgttctgctg aggagagaaa ggtgagccaa gggcagggcc caggtccccc 1020 
agggggcccc cgagagcccg gacatgcacc ttctggatgt gtttgttcaa gtaggactta 1080 
gagcggaaga agctcccaca ttcagggcat gggtacttct tctccccatc agactccatt 1140 
ttgtttttgg ggactgccat gtcgcaggag aaagagccat tggcactctg cttctctggc 1200 
gtcttcaggt cgctggcatc tgagaggtca ccataggagt cagagctctc aatcggatcc 1260 
tgatgtgagc atttctggcc ttctcggtta cagatactgc agaagttgct gggcccctcg 1320 
ctgtgcttct tcaggtggtc tgccatgtat gctgcccgca agtacttccc acacacctgg 1380 

cagggcacct tgtcttc atg aca ggc cag gtg gga gcg cag acg gtc tcg 1430 
Met Thr Gly Gln Val Gly Ala Gln Thr Val Ser 
1 5 10 

ggt ggc aaa aga agc att gca ggt ctg aca ctt gtg agg ccg ctc aga 1478 
Gly Gly Lys Arg Ser Ile Ala Gly Leu Thr Leu Val Arg Pro Leu Arg 
15 20 25 

agt gtg cac ctg ctt gat atg tcc gtt caa gtg atc agg cct gga gaa 1526 
Ser Val His Leu Leu Asp Met Ser Val Gln Val Ile Arg Pro Gly Glu 
30 35 40 

gcc ttt ccc aca gct ctg gca gat gta agg cgg aat tcc cca gag aag 1574 
Ala Phe Pro Thr Ala Leu Ala Asp Val Arg Arg Asn Ser Pro Glu Lys 
45 50 55 

aag ggt ggt gaa gac tcc cgg ctc tca gct gcc ccc tgc atc aga ccc 1622 
Lys Gly Gly Glu Asp Ser Arg Leu Ser Ala Ala Pro Cys Ile Arg Pro 
60 65 70 75 

agc agc tcc cct ccc act gtg gct ccc gca tct gcc tcc ctg ccc cag 1670 
Ser Ser Ser Pro Pro Thr Val Ala Pro Ala Ser Ala Ser Leu Pro Gln 
80 85 90 

ccc atc ctc tct aac caa gga atc atg ttc gtt cag gag gag gcc ctg 1718 
Pro Ile Leu Ser Asn Gln Gly Ile Met Phe Val Gln Glu Glu Ala Leu 
95 100 105 

gcc agc agc ctc tcg tcc act gac agt ctg act ccc gag cac cag ccc 1766 
Ala Ser Ser Leu Ser Ser Thr Asp Ser Leu Thr Pro Glu His Gln Pro 
110 115 120 

att gcc cag gga tgt tct gat tcc ttg gag tcc atc cct gcg gga cag 1814 
Ile Ala Gln Gly Cys Ser Asp Ser Leu Glu Ser Ile Pro Ala Gly Gln 
125 130 135 

gca gct tcc gat gat tta agg gac gtg cca gga gct gtt ggt ggt gca 1862 
Ala Ala Ser Asp Asp Leu Arg Asp Val Pro Gly Ala Val Gly Gly Ala 
140 145 150 155 

agc cca gaa cat gcc gag ccg gag gtc cag gtg gtg ccg ggg tct ggc 1910 
Ser Pro Glu His Ala Glu Pro Glu Val Gln Val Val Pro Gly Ser Gly 
160 165 170 

cag atc atc ttc ctg ccc ttc acc tgc att ggc tac acg gcc acc aat 1958 
Gln Ile Ile Phe Leu Pro Phe Thr Cys Ile Gly Tyr Thr Ala Thr Asn 
175 180 185 

cag gac ttc atc cag cgc ctg agc aca ctg atc cgg cag gcc atc gag 2006 
Gln Asp Phe Ile Gln Arg Leu Ser Thr Leu Ile Arg Gln Ala Ile Glu 
190 195 200 

cgg cag ctg cct gcc tgg atc gag gct gcc aac cag cgg gag gag ggc 2054 
Arg Gln Leu Pro Ala Trp Ile Glu Ala Ala Asn Gln Arg Glu Glu Gly 
205 210 215 

cag ggt gaa cag ggc gag gag gag gat gag gag gag gaa gaa gag gag 2102 
Gln Gly Glu Gln Gly Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Glu 
220 225 230 235 

gac gtg gct gag aac cgc tac ttt gaa atg ggg ccc cca gac gtg gag 2150 
Asp Val Ala Glu Asn Arg Tyr Phe Glu Met Gly Pro Pro Asp Val Glu 
240 245 250 

gag gag gag gga gga ggc cag ggg gag gaa gag gag gag gaa gag gag 2198 
Glu Glu Glu Gly Gly Gly Gln Gly Glu Glu Glu Glu Glu Glu Glu Glu 
255 260 265 

gat gaa gag gcc gag gag gag cgc ctg gct ctg gaa tgg gcc ctg ggc 2246 
Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala Leu Glu Trp Ala Leu Gly 
270 275 280 

gcg gac gag gac ttc ctg ctg gag cac atc cgc atc ctc aag gtg ctg 2294 
Ala Asp Glu Asp Phe Leu Leu Glu His Ile Arg Ile Leu Lys Val Leu 
285 290 295 

tgg tgc ttc ctg atc cat gtg cag ggc agt atc cgc cag ttc gcc gcc 2342 
Trp Cys Phe Leu Ile His Val Gln Gly Ser Ile Arg Gln Phe Ala Ala 
300 305 310 315 

tgc ctt gtg ctc acc gac ttc ggc atc gca gtc ttc gag atc ccg cac 2390 
Cys Leu Val Leu Thr Asp Phe Gly Ile Ala Val Phe Glu Ile Pro His 
320 325 330 

cag gag tct cgg ggc agc agc cag cac atc ctc tcc tcc ctg cgc ttt 2438 
Gln Glu Ser Arg Gly Ser Ser Gln His Ile Leu Ser Ser Leu Arg Phe 
335 340 345 

gtc ttt tgc ttc ccg cat ggc gac ctc acc gag ttt ggc ttc ctc atg 2486 
Val Phe Cys Phe Pro His Gly Asp Leu Thr Glu Phe Gly Phe Leu Met 
350 355 360 

ccg gag ctg tgt ctg gtg ctc aag gta cgg cac agt gag aac acg ctc 2534 
Pro Glu Leu Cys Leu Val Leu Lys Val Arg His Ser Glu Asn Thr Leu 
365 370 375 

ttc att atc tcg gac gcc gcc aac ctg cac gag ttc cac gcg gac ctg 2582 
Phe Ile Ile Ser Asp Ala Ala Asn Leu His Glu Phe His Ala Asp Leu 
380 385 390 395 

cgc tca tgc ttt gca ccc cag cac atg gcc atg ctg tgt agc ccc atc 2630 
Arg Ser Cys Phe Ala Pro Gln His Met Ala Met Leu Cys Ser Pro Ile 
400 405 410 

ctc tac ggc agc cac acc agc ctg cag gag ttc ctg cgc cag ctg ctc 2678 
Leu Tyr Gly Ser His Thr Ser Leu Gln Glu Phe Leu Arg Gln Leu Leu 
415 420 425 

acc ttc tac aag gtg gct ggc ggc tgc cag gag cgc agc cag ggc tgc 2726 
Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln Glu Arg Ser Gln Gly Cys 
430 435 440 

ttc ccc gtc tac ctg gtc tac agt gac aag cgc atg gtg cag acg gcc 2774 
Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys Arg Met Val Gln Thr Ala 
445 450 455 

gcc ggg gac tac tca ggc aac atc gag tgg gcc agc tgc aca ctc tgt 2822 
Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp Ala Ser Cys Thr Leu Cys 
460 465 470 475 

tca gcc gtg cgg cgc tcc tgc tgc gcg ccc tct gag gcc gtc aag tcc 2870 
Ser Ala Val Arg Arg Ser Cys Cys Ala Pro Ser Glu Ala Val Lys Ser 
480 485 490 

gcc gcc atc ccc tac tgg ctg ttg ctc acg ccc cag cac ctc aac gtc 2918 
Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr Pro Gln His Leu Asn Val 
495 500 505 

atc aag gcc gac ttc aac ccc atg ccc aac cgt ggc acc cac aac tgt 2966 
Ile Lys Ala Asp Phe Asn Pro Met Pro Asn Arg Gly Thr His Asn Cys 
510 515 520 

cgc aac cgc aac agc ttc aag ctc agc cgt gtg ccg ctc tcc acc gtg 3014 
Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg Val Pro Leu Ser Thr Val 
525 530 535 

ctg ctg gac ccc aca cgc agc tgt acc cag cct cgg ggc gcc ttt gct 3062 
Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln Pro Arg Gly Ala Phe Ala 
540 545 550 555 

gat ggc cac gtg cta gag ctg ctc gtg ggg tac cgc ttt gtc act gcc 3110 
Asp Gly His Val Leu Glu Leu Leu Val Gly Tyr Arg Phe Val Thr Ala 
560 565 570 

atc ttc gtg ctg ccc cac gag aag ttc cac ttc ctg cgc gtc tac aac 3158 
Ile Phe Val Leu Pro His Glu Lys Phe His Phe Leu Arg Val Tyr Asn 
575 580 585 

cag ctg cgg gcc tcg ctg cag gac ctg aag act gtg gtc atc gcc aag 3206 
Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys Thr Val Val Ile Ala Lys 
590 595 600 

acc ccc ggg acg gga ggc agc ccc cag ggc tcc ttt gcg gat ggc cag 3254 
Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly Ser Phe Ala Asp Gly Gln 
605 610 615 

cct gcc gag cgc agg gcc agc aat gac cag cgt ccc cag gag gtc cca 3302 
Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln Arg Pro Gln Glu Val Pro 
620 625 630 635 

gca gag gct ctg gcc ccg gcc cca gtg gaa gtc cca gct cca gcc ccg 3350 
Ala Glu Ala Leu Ala Pro Ala Pro Val Glu Val Pro Ala Pro Ala Pro 
640 645 650 

gaa ttc gat atc aag ctt atc gat acc gtc gac ct 3385 
Glu Phe Asp Ile Lys Leu Ile Asp Thr Val Asp 
655 660 



<210> 2 

<211> 1954 

<212> DNA 

<213> Homo sapiens 

<400> 2 

atgacaggcc aggtgggagc gcagacggtc tcgggtggca aaagaagcat tgcaggtctg 60 
acacttgtga ggccgctcag aagtgtgcac ctgcttgata tgtccgttca agtgatcagg 120 
cctggagaag cctttcccac agctctggca gatgtaaggc ggaattcccc agagaagaag 180 
ggtggtgaag actcccggct ctcagctgcc ccctgcatca gacccagcag ctcccctccc 240 
actgtggctc ccgcatctgc ctccctgccc cagcccatcc tctctaacca aggaatcatg 300 
ttcgttcagg aggaggccct ggccagcagc ctctcgtcca ctgacagtct gactcccgag 360 
caccagccca ttgcccaggg atgttctgat tccttggagt ccatccctgc gggacaggca 420 
gcttccgatg atttaaggga cgtgccagga gctgttggtg gtgcaagccc agaacatgcc 480 
gagccggagg tccaggtggt gccggggtct ggccagatca tcttcctgcc cttcacctgc 540 
attggctaca cggccaccaa tcaggacttc atccagcgcc tgagcacact gatccggcag 600 
gccatcgagc ggcagctgcc tgcctggatc gaggctgcca accagcggga ggagggccag 660 
ggtgaacagg gcgaggagga ggatgaggag gaggaagaag aggaggacgt ggctgagaac 720 
cgctactttg aaatggggcc cccagacgtg gaggaggagg agggaggagg ccagggggag 780 
gaagaggagg aggaagagga ggatgaagag gccgaggagg agcgcctggc tctggaatgg 840 
gccctgggcg cggacgagga cttcctgctg gagcacatcc gcatcctcaa ggtgctgtgg 900 
tgcttcctga tccatgtgca gggcagtatc cgccagttcg ccgcctgcct tgtgctcacc 960 
gacttcggca tcgcagtctt cgagatcccg caccaggagt ctcggggcag cagccagcac 1020 
atcctctcct ccctgcgctt tgtcttttgc ttcccgcatg gcgacctcac cgagtttggc 1080 
ttcctcatgc cggagctgtg tctggtgctc aaggtacggc acagtgagaa cacgctcttc 1140 
attatctcgg acgccgccaa cctgcacgag ttccacgcgg acctgcgctc atgctttgca 1200 
ccccagcaca tggccatgct gtgtagcccc atcctctacg gcagccacac cagcctgcag 1260 
gagttcctgc gccagctgct caccttctac aaggtggctg gcggctgcca ggagcgcagc 1320 
cagggctgct tccccgtcta cctggtctac agtgacaagc gcatggtgca gacggccgcc 1380 
ggggactact caggcaacat cgagtgggcc agctgcacac tctgttcagc cgtgcggcgc 1440 
tcctgctgcg cgccctctga ggccgtcaag tccgccgcca tcccctactg gctgttgctc 1500 
acgccccagc acctcaacgt catcaaggcc gacttcaacc ccatgcccaa ccgtggcacc 1560 
cacaactgtc gcaaccgcaa cagcttcaag ctcagccgtg tgccgctctc caccgtgctg 1620 
ctggacccca cacgcagctg tacccagcct cggggcgcct ttgctgatgg ccacgtgcta 1680 
gagctgctcg tggggtaccg ctttgtcact gccatcttcg tgctgcccca cgagaagttc 1740 
cacttcctgc gcgtctacaa ccagctgcgg gcctcgctgc aggacctgaa gactgtggtc 1800 
atcgccaaga cccccgggac gggaggcagc ccccagggct cctttgcgga tggccagcct 1860 
gccgagcgca gggccagcaa tgaccagcgt ccccaggagg tcccagcaga ggctctggcc 1920 
ccggccccag tggaagtccc agctccagcc ccgg 1954 


<210> 3 
<211> 3318 
<212> DNA 
<213> Homo sapiens

<400> 3














aattccagtt taatactaac cctaatgtgt gactgcggtt tacaaagagc tctgtatcac 60
ctgggatagc tttcagtagc aattcactac aactggtcct aaaaaataat aacaataata 120
ataataatta gagaattaaa acccaacagc atgttgaatg gttaaaatca cgtaagaact 180
gaaatttggg gtgggggtgt cctcaacagc tgagcttgtc ctagcagtga aaatgctcgc 240
ctccaagcag ggctcagaaa ggtctggagc cctccaggca gagggctgag ctcagggggc 300
tcttggagga cactcacccc atggtccatg ggatgcttct ggcttcctta aaaacagttg 360
ggcatccgca ttgtataagt aggtggagac cctagtgtgg ttcttttgaa ggatatggga 420
agggaggatg acgaactaga gaagtgggag gggaccaaaa tcactgaggt cccagaatat 480



catagatttg ggtataggat tggggtcact aagaattgag caccaggaat tccagcttct 540
tcccattaaa gaaactggga ctggttttgc cttggaggcc tatgtagtgt tttctgcccc 600
tgtcccatac caagtctcat tgatatttct gcagaatatc agatgaaaat ctatttctaa 660
agaccattgg gagaatgggt ggtggagaag gagttggagt ggggttgggg ggcagttaaa 720
aatgaataaa aatctctcag ctacagaacc caaacatcac ttccctccgc attcacagca 780
tttcccagca gtccccagat ggttgtttcc gtggggacac agcagctgcc tcatttccct 840
tcaggcccca tgggctgctg gtcaacctca ggatctacta aagatgacgc aaatgccgac 900
tgaacaatct gaaacccaaa ggactcgagg agagacatgt tctgctgagg agagaaaggt 960
gagccaaggg cagggcccag gtcccccagg gggcccccga gagcccggac atgcaccttc 1020
tggatgtgtt tgttcaagta ggacttagag cggaagaagc tcccacattc agggcatggg 1080
tacttcttct ccccatcaga ctccattttg tttttgggga ctgccatgtc gcaggagaaa 1140
gagccattgg cactctgctt ctctggcgtc ttcaggtcgc tggcatctga gaggtcacca 1200
taggagtcag agctctcaat cggatcctga tgtgagcatt tctggccttc tcggttacag 1260
atactgcaga agttgctggg cccctcgctg tgcttcttca ggtggtctgc catgtatgct 1320
gcccgcaagt acttcccaca cacctggcag ggcaccttgt cttcatgaca ggccaggtgg 1380
gagcgcagac ggtctcgggt ggcaaaagaa gcattgcagg tctgacactt gtgaggccgc 1440
tcagaagtgt gcacctgctt gatatgtccg ttcaagtgat caggcctgga gaagcctttc 1500
ccacagctct ggcagatgta aggcggaatt ccccagagaa gaagggtggt gaagactccc 1560
ggctctcagc tgccccctgc atcagaccca gcagctcccc tcccactgtg gctcccgcat 1620
ctgcctccct gccccagccc atcctctcta accaaggaat catgttcgtt caggaggagg 1680
ccctggccag cagcctctcg tccactgaca gtctgactcc cgagcaccag cccattgccc 1740
agggatgttc tgattccttg gagtccatcc ctgcgggaca ggcagcttcc gatgatttaa 1800
gggacgtgcc aggagctgtt ggtggtgcaa gcccagaaca tgccgagccg gaggtccagg 1860
tggtgccggg gtctggccag atcatcttcc tgcccttcac ctgcattggc tacacggcca 1920
ccaatcagga cttcatccag cgcctgagca cactgatccg gcaggccatc gagcggcagc 1980
tgcctgcctg gatcgaggct gccaaccagc gggaggaggg ccagggtgaa cagggcgagg 2040
aggaggatga ggaggaggaa gaagaggagg acgtggctga gaaccgctac tttgaaatgg 2100
ggcccccaga cgtggaggag gaggagggag gaggccaggg ggaggaagag gaggaggaag 2160
aggaggatga agaggccgag gaggagcgcc tggctctgga atgggccctg ggcgcggacg 2220
aggacttcct gctggagcac atccgcatcc tcaaggtgct gtggtgcttc ctgatccatg 2280
tgcagggcag tatccgccag ttcgccgcct gccttgtgct caccgacttc ggcatcgcag 2340



tcttcgagat cccgcaccag gagtctcggg gcagcagcca gcacatcctc tcctccctgc 2400
gctttgtctt ttgcttcccg catggcgacc tcaccgagtt tggcttcctc atgccggagc 2460
tgtgtctggt gctcaaggta cggcacagtg agaacacgct cttcattatc tcggacgccg 2520
ccaacctgca cgagttccac gcggacctgc gctcatgctt tgcaccccag cacatggcca 2580
tgctgtgtag ccccatcctc tacggcagcc acaccagcct gcaggagttc ctgcgccagc 2640
tgctcacctt ctacaaggtg gctggcggct gccaggagcg cagccagggc tgcttccccg 2700
tctacctggt ctacagtgac aagcgcatgg tgcagacggc cgccggggac tactcaggca 2760
acatcgagtg ggccagctgc acactctgtt cagccgtgcg gcgctcctgc tgcgcgccct 2820
ctgaggccgt caagtccgcc gccatcccct actggctgtt gctcacgccc cagcacctca 2880
acgtcatcaa ggccgacttc aaccccatgc ccaaccgtgg cacccacaac tgtcgcaacc 2940
gcaacagctt caagctcagc cgtgtgccgc tctccaccgt gctgctggac cccacacgca 3000
gctgtaccca gcctcggggc gcctttgctg atggccacgt gctagagctg ctcgtggggt 3060
accgctttgt cactgccatc ttcgtgctgc cccacgagaa gttccacttc ctgcgcgtct 3120
acaaccagct gcgggcctcg ctgcaggacc tgaagactgt ggtcatcgcc aagacccccg 3180
ggacgggagg cagcccccag ggctcctttg cggatggcca gcctgccgag cgcagggcca 3240
gcaatgacca gcgtccccag gaggtcccag cagaggctct ggccccggcc ccagtggaag 3300
tcccagctcc agccccgg 3318



<210> 4 

<211> 1171 

<212> DNA 

<213> Homo sapiens 

<400> 4 

gaggaggagg aagaggagga tgaagaggcc gaggaggagc gcctggctct ggaatgggcc 60
ctgggcgcgg acgaggactt cctgctggag cacatccgca tcctcaaggt gctgtggtgc 120
ttcctgatcc atgtgcaggg cagtatccgc cagttcgccg cctgccttgt gctcaccgac 180
ttcggcatcg cagtcttcga gatcccgcac caggagtctc ggggcagcag ccagcacatc 240
ctctcctccc tgcgctttgt cttttgcttc ccgcatggcg acctcaccga gtttggcttc 300
ctcatgccgg agctgtgtct ggtgctcaag gtacggcaca gtgagaacac gctcttcatt 360
atctcggacg ccgccaacct gcacgagttc cacgcggacc tgcgctcatg ctttgcaccc 420
cagcacatgg ccatgctgtg tagccccatc ctctacggca gccacaccag cctgcaggag 480
ttcctgcgcc agctgctcac cttctacaag gtggctggcg gctgccagga gcgcagccag 540



ggctgcttcc ccgtctacct ggtctacagt gacaagcgca tggtgcagac ggccgccggg 600
gactactcag gcaacatcga gtgggccagc tgcacactct gttcagccgt gcggcgctcc 660
tgctgcgcgc cctctgaggc cgtcaagtcc gccgccatcc cctactggct gttgctcacg 720
ccccagcacc tcaacgtcat caaggccgac ttcaacccca tgcccaaccg tggcacccac 780
aactgtcgca accgcaacag cttcaagctc agccgtgtgc cgctctccac cgtgctgctg 840
gaccccacac gcagctgtac ccagcctcgg ggcgcctttg ctgatggcca cgtgctagag 900
ctgctcgtgg ggtaccgctt tgtcactgcc atcttcgtgc tgccccacga gaagttccac 960
ttcctgcgcg tctacaacca gctgcgggcc tcgctgcagg acctgaagac tgtggtcatc 1020
gccaagaccc ccgggacggg aggcagcccc cagggctcct ttgcggatgg ccagcctgcc 1080
gagcgcaggg ccagcaatga ccagcgtccc caggaggtcc cagcagaggc tctggccccg 1140
gccccagtgg aagtcccagc tccagccccg g 1171



<210> 5 

<211> 651 

<212> PRT 

<213> Homo sapiens 



<400> 5 



Met Thr Gly Gln Val Gly Ala Gln Thr Val Ser Gly Gly Lys Arg Ser
1 5 10 15

Ile Ala Gly Leu Thr Leu Val Arg Pro Leu Arg Ser Val His Leu Leu
20 25 30

Asp Met Ser Val Gln Val Ile Arg Pro Gly Glu Ala Phe Pro Thr Ala
35 40 45

Leu Ala Asp Val Arg Arg Asn Ser Pro Glu Lys Lys Gly Gly Glu Asp
50 55 60

Ser Arg Leu Ser Ala Ala Pro Cys Ile Arg Pro Ser Ser Ser Pro Pro
65 70 75 80

Thr Val Ala Pro Ala Ser Ala Ser Leu Pro Gln Pro Ile Leu Ser Asn
85 90 95

Gln Gly Ile Met Phe Val Gln Glu Glu Ala Leu Ala Ser Ser Leu Ser
100 105 110

Ser Thr Asp Ser Leu Thr Pro Glu His Gln Pro Ile Ala Gln Gly Cys
115 120 125

Ser Asp Ser Leu Glu Ser Ile Pro Ala Gly Gln Ala Ala Ser Asp Asp
130 135 140



Leu Arg Asp Val Pro Gly Ala Val Gly Gly Ala Ser Pro Glu His Ala 
145 150 155 160 



Glu Pro Glu Val Gln Val Val Pro Gly Ser Gly Gln Ile Ile Phe Leu
165 170 175

Pro Phe Thr Cys Ile Gly Tyr Thr Ala Thr Asn Gln Asp Phe Ile Gln
180 185 190

Arg Leu Ser Thr Leu Ile Arg Gln Ala Ile Glu Arg Gln Leu Pro Ala
195 200 205

Trp Ile Glu Ala Ala Asn Gln Arg Glu Glu Gly Gln Gly Glu Gln Gly
210 215 220

Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Glu Asp Val Ala Glu Asn
225 230 235 240

Arg Tyr Phe Glu Met Gly Pro Pro Asp Val Glu Glu Glu Glu Gly Gly
245 250 255

Gly Gln Gly Glu Glu Glu Glu Glu Glu Glu Glu Asp Glu Glu Ala Glu
260 265 270

Glu Glu Arg Leu Ala Leu Glu Trp Ala Leu Gly Ala Asp Glu Asp Phe
275 280 285

Leu Leu Glu His Ile Arg Ile Leu Lys Val Leu Trp Cys Phe Leu Ile
290 295 300

His Val Gln Gly Ser Ile Arg Gln Phe Ala Ala Cys Leu Val Leu Thr
305 310 315 320

Asp Phe Gly Ile Ala Val Phe Glu Ile Pro His Gln Glu Ser Arg Gly
325 330 335

Ser Ser Gln His Ile Leu Ser Ser Leu Arg Phe Val Phe Cys Phe Pro
340 345 350

His Gly Asp Leu Thr Glu Phe Gly Phe Leu Met Pro Glu Leu Cys Leu
355 360 365

Val Leu Lys Val Arg His Ser Glu Asn Thr Leu Phe Ile Ile Ser Asp
370 375 380

Ala Ala Asn Leu His Glu Phe His Ala Asp Leu Arg Ser Cys Phe Ala
385 390 395 400

Pro Gln His Met Ala Met Leu Cys Ser Pro Ile Leu Tyr Gly Ser His
405 410 415

Thr Ser Leu Gln Glu Phe Leu Arg Gln Leu Leu Thr Phe Tyr Lys Val
420 425 430

Ala Gly Gly Cys Gln Glu Arg Ser Gln Gly Cys Phe Pro Val Tyr Leu
435 440 445

Val Tyr Ser Asp Lys Arg Met Val Gln Thr Ala Ala Gly Asp Tyr Ser
450 455 460

Gly Asn Ile Glu Trp Ala Ser Cys Thr Leu Cys Ser Ala Val Arg Arg
465 470 475 480

Ser Cys Cys Ala Pro Ser Glu Ala Val Lys Ser Ala Ala Ile Pro Tyr
485 490 495



Trp Leu Leu Leu Thr Pro Gln His Leu Asn Val Ile Lys Ala Asp Phe
500 505 510

Asn Pro Met Pro Asn Arg Gly Thr His Asn Cys Arg Asn Arg Asn Ser
515 520 525

Phe Lys Leu Ser Arg Val Pro Leu Ser Thr Val Leu Leu Asp Pro Thr
530 535 540

Arg Ser Cys Thr Gln Pro Arg Gly Ala Phe Ala Asp Gly His Val Leu
545 550 555 560

Glu Leu Leu Val Gly Tyr Arg Phe Val Thr Ala Ile Phe Val Leu Pro
565 570 575

His Glu Lys Phe His Phe Leu Arg Val Tyr Asn Gln Leu Arg Ala Ser
580 585 590

Leu Gln Asp Leu Lys Thr Val Val Ile Ala Lys Thr Pro Gly Thr Gly
595 600 605

Gly Ser Pro Gln Gly Ser Phe Ala Asp Gly Gln Pro Ala Glu Arg Arg
610 615 620

Ala Ser Asn Asp Gln Arg Pro Gln Glu Val Pro Ala Glu Ala Leu Ala
625 630 635 640

Pro Ala Pro Val Glu Val Pro Ala Pro Ala Pro
645 650



<210> 6 
<211> 390 
<212> PRT 

<213> Homo sapiens 



<400> 6 

Glu Glu Glu Glu Glu Glu Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala
1 5 10 15

Leu Glu Trp Ala Leu Gly Ala Asp Glu Asp Phe Leu Leu Glu His Ile
20 25 30

Arg Ile Leu Lys Val Leu Trp Cys Phe Leu Ile His Val Gln Gly Ser
35 40 45

Ile Arg Gln Phe Ala Ala Cys Leu Val Leu Thr Asp Phe Gly Ile Ala
50 55 60

Val Phe Glu Ile Pro His Gln Glu Ser Arg Gly Ser Ser Gln His Ile
65 70 75 80

Leu Ser Ser Leu Arg Phe Val Phe Cys Phe Pro His Gly Asp Leu Thr
85 90 95

Glu Phe Gly Phe Leu Met Pro Glu Leu Cys Leu Val Leu Lys Val Arg
100 105 110



His Ser Glu Asn Thr Leu Phe Ile Ile Ser Asp Ala Ala Asn Leu His
115 120 125

Glu Phe His Ala Asp Leu Arg Ser Cys Phe Ala Pro Gln His Met Ala
130 135 140

Met Leu Cys Ser Pro Ile Leu Tyr Gly Ser His Thr Ser Leu Gln Glu
145 150 155 160

Phe Leu Arg Gln Leu Leu Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln
165 170 175



Glu Arg Ser Gln Gly Cys Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys
180 185 190

Arg Met Val Gln Thr Ala Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp
195 200 205

Ala Ser Cys Thr Leu Cys Ser Ala Val Arg Arg Ser Cys Cys Ala Pro
210 215 220

Ser Glu Ala Val Lys Ser Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr
225 230 235 240

Pro Gln His Leu Asn Val Ile Lys Ala Asp Phe Asn Pro Met Pro Asn
245 250 255

Arg Gly Thr His Asn Cys Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg
260 265 270

Val Pro Leu Ser Thr Val Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln
275 280 285

Pro Arg Gly Ala Phe Ala Asp Gly His Val Leu Glu Leu Leu Val Gly
290 295 300

Tyr Arg Phe Val Thr Ala Ile Phe Val Leu Pro His Glu Lys Phe His
305 310 315 320

Phe Leu Arg Val Tyr Asn Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys
325 330 335

Thr Val Val Ile Ala Lys Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly
340 345 350

Ser Phe Ala Asp Gly Gln Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln
355 360 365

Arg Pro Gln Glu Val Pro Ala Glu Ala Leu Ala Pro Ala Pro Val Glu
370 375 380

Val Pro Ala Pro Ala Pro
385 390

<210> 7

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 7 

cttgaggatg cggatgtgct 



<210> 8 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 8 

ccatggggtg agtgtcct 



<210> 9 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 9 

aggacactca ccccatgg 



<210> 10 
<211> 20 
<212> DNA 
<213> Homo sapiens

<400> 10

gtatgggaca ggggcagaaa



<210> 11 

<211> 20 

<212> DNA 

<213> Homo sapiens 



<400> 11 

tttctaaaga ccattgggag 



<212> DNA
<213> Homo sapiens

<400> 12

ccattttaaa gtagcggttc



<210> 13 

<211> 20 

<212> DNA 

<213> Homo sapiens 



<400> 13 

aggagagaaa ggtgagccaa 



<210> 14 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 14 

gtagatcctg aggttgacca 



<210> 15 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 15 

tgtgagcatt tctggccttc 



<210> 16 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 16 



tgaagacgcc agagaagcag 



<210> 17 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 17 

gcctcacaag tgtcagacct 



<210> 18 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 18 

agaagggtgg tgaagact 



<210> 19 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 19 

cttggttaga gaggatgggc 



<210> 20 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 20 

gcccatcctc tctaaccaa 



<210> 21 

<211> 15202 

<212> DNA 

<213> Homo sapiens 

<400> 21 

gatccgagct caattaaccc tcactaaagg gagtcgactc gatccttaaa atattcatat 60 

ctcctggaca acctgtggcc atagtgcctg actgtaaacc caaagggttt gcctttgcca 120 

gtgtagccca gcctggtgtc tgctgcccct cgcggtgtct gtgcacctgc cacgatgctg 180 



accagacacc cttaaccagg ttcacccatc gcctgggcct ggagcagtcc ccctgatgct 240
ctgattggtc cttggacctt ctgttctccc aaaatcccag gtcagaaaat acctggaagt 300
ctatttgtgt cccacctccc tctttgtggc cgcaagtgcc ccttcctcca cacagtcaca 360
agaccatgag atgccatctc ctcccctcct gggctgcaga ctttgggaag ctcccaggcc 420
acagaggtgt cagctcctgt ccaggccctt gggaccttcc ctcattcaac caccctaccc 480
aaccccccac tgcctgccag ccaccactcc ctcccacatt tgcaggcggg ggccctgccc 540
tctcctgccg ctggttcccc tacccaggag gctctcccat cgctcttttg agagtctgcc 600
tcccacctct aactgggggc ttagttcaag ttgccccctt accctagtcc cagctgccca 660
agagcttgct gcctcctgtt cttggtgagg gactccagag acagatgtga gacctccctg 720
gacccctcca aggcattccc aggtcacttc catgagtagt gaagaaccgc ctctgagcag 780
gctgagcctc cctcagccta tggtgtcctc acgtggcttg gcccacagca ggtgctcacg 840
cctcctcctc agcagagcct accatcctcc tgccatgctc accagtcccc atgctgatag 900
ccatcaccag tccccatgct gatagccatc accagtcccc atgctgatag ccactttctg 960
gatgctctag gtctgtctgg atgacacagt gaccacagag aaggagctgg acactgtgga 1020
agtgctgaaa gcaattcaga aagccaagga ggtcaagtcc aaactgagca acccagagaa 1080
gaaggtgggt ttgtgtggca ggtgggaggg cagtggtgca gagccagccg ggataggagc 1140
cagttcgggg ggcttgggcc atgggactgc tcagggctgc cgagtcccag ctgcgcccct 1200
ccctggctgc atgacctcgg gcaagtcgcg gcctctctgt tctctgtggg gtggggacag 1260
tggtagttcc tgctctaagg atatgatgag accatcttta ccacccagtt ggtgggaacc 1320
gttgcgctcc ctcctcacac ccctggcctt ggggagctct gtgcttcctc ttctctcccg 1380
ggctgactca agcactcgtc ctcagggtgg tgaagactcc cggctctcag ctgccccctg 1440
catcagaccc agcagctccc ctcccactgt ggctcccgca tctgcctccc tgccccagcc 1500
catcctctct aaccaaggta atcgtgtatg tatcttgctt ctagtggagc cacacagccc 1560
tgcctgggcc ccctggctgg gctggggttg ggggagaggt gccagcacct gcttccaaca 1620
gggtcagaca cagggagggc agtgccttct gcaggctggt cctcgcgggg ggacacatgg 1680
caggggtgcc tggcctgatg ccagctgttg cttgcttggt gaggactccc aattgctctg 1740
atgcccacat ccagctcctc taggagaccg cagggtgtct gacaggccct gaggctgccc 1800
tctgaacagg ctcggggctg ttggctcatg ggacccattc cctcaccggc agcacaagca 1860
ggttggctcc tggttacagg aagccgggct tgtgacttta ctgtctggag cccgaatccc 1920
tgtgcaggga aaagcttgct tttatcactg cctcatctct gtggggtgac ccagccccag 1980
aacaccatgt ttgtggggcc aagatgggcc atctctgtcc ctgtggaccc atggaagacc 2040



aggcccattc gtctgcccac tatcttagcg ttttcaaagg gctttcacct ctgaacccag 2100
gcatcctcgg agatgagtga gtgaagcagg tctcatgagc gtgtctgctg gcccggcccc 2160
cacggaagag gggagggtgt gccgtcccga gtggagccga ggctcgggac acgcaggaaa 2220
ggacgccgcc tgcccgggct cctggagacg cagaacttgg tgtgaggtct tgggaaaaca 2280
gttcaacccg atgttttaag agccagaaaa acattcccac cccttgacct ggtaacccca 2340
ctggtgggga ttttctctta gagggataag ataccgggaa ggggaggtga aatgctcacc 2400
actgccaaaa cacgggctgc aactgcaaca tcggaggatg agagggagag tcggctgtgg 2460
tgcagaatgc tcagcagccc tcccagcagg gacaggaaga ctgggcagga agaggggaga 2520
agcattcaag ttaaggcaaa aggcccaacg cagagcagca cactgaggtc acacctgtga 2580
gatgtggaag agaattcctg agcgtggagc gatggggtta ggtgccagga tgattgccca 2640
ttttgcttct gtcagactct tgactaagga tttctggttg cattttatta cataaaagcc 2700
agggaggtta tatcacggtg agaaagcttc cctgacgccg cctcctgtag cgcagccaag 2760
cgagcctgtg gaggtaccat atgactgtag gcctctgggg acagggagct gcatctgctt 2820
ctcaaggcca gggacacagc catttctgcc agcatctgtt gatcagtgag tgagtgagtg 2880
ggcaggtaga gcaggagcca gtgaagagca ggccctggat gggtggggat gcaccatgtc 2940
cccaggctgc agctgcaggc agccccccac attgtcggag aagcctctgc accagctcag 3000
ccccctcctc actccccttg tgccctgggg acactctgca gaggggcact ctgcagtctg 3060
tccccgccat cgctggactt ctggacatgg cctccagatt tgcacctctt aaataaatct 3120
gcagtggatg tctttgtgtg cacctctctt tccttttggt gagaaacagc aaagatcgga 3180
cccctaagga ctctcctgat gtctccgctc tatccgctga gtgccctttc tgaccacttg 3240
tttgtacagg ccacggtcca ggacgggagc agatagactg tccctgtccc tgtccacatt 3300
tccttggtcc aaacagggct tgtgggaggt agtggcaaaa ggtgttggtc tttttctcac 3360
tgatttggag gcctccccgt gtgttttttc agccgcgtgt tcctgggtct tgcctggatg 3420
gacagggttt tttagcgcgt gggagcagct ttgctgacca tgcctgttgc ttccagcctg 3480
attcccgaga agggagcgtg cttgcgaagg aactggcact cgggcctgcc tgaagggggc 3540
gctgtccaga cacacccagc ctcccgtcgt ggcaggcgct gtcggagcca tggatgattg 3600
tgaccaatag gggtggtcgc cagagttgat tgtccagcca ggcccagggg ctgagaggag 3660
gctgtgtgga gaggtggtta ggagccaggg ctcggtcagc tgagttcgca tgccagcttc 3720
ctagctgtgg gacctcaagc aacttgtagc ccctctgaag ctgttttctc aactgtgaag 3780
tggacgcacc ctacttcatt gattctaaga ggcacgcatt tccaccttgt gacttctctg 3840
aaactgaggt gcgtctttca gtcagtggcg tctcatagtc gctgtcagcc agctggtatt 3900
cgagatggag tcgtggaaaa cccgtggaca ccttccgcta ggaccaagat ggcgccacct 3960
gccgcatctt agatttgatg aaatgtggta aataacgaga ggcatgcatg agcgaatgct 4020 
ggggaggcgc ttggcactac ccagagctcc acagaggtgg tcgatgaggg ctgccctttc 4080 
ccacatcctt agtagggggt tcaacatgac ccagactgtg cccctgggga gcttggagcc 4140 
atgcgggagg atgagccatg tgctggagga gaacagggta ggatggtgtg gggcttttgt 4200 
agactgtcta gagcagagaa ggtctgcagt ggaggtggtg tctgaggtga atctcgaagg 4260 
tgaataggag ttgaacgtta gcaggcagag ggtggattgc aggagagcag cggcctgggc 4320 
aggtgcccag cgtggcccat cagggtgctt catgcatggc tgtgtgcttg ccatccttcc 4380 
tgcctgccta ccccctgctg cttcgcttca tgggggcgtt tgagcttggg cccacctgcc 4440 
tgcctcgctt gtgggcagag gacccaggct gtgtgagttg tcctgtcccg gggagcagct 4500 
gagcttgtcc gggttcctcg acctgtgggg cttcagagga cttcgggtca tttcaatggg 4560 
ctgtggcgat gctggctgtg gaggtagcct agggctcctg tagccttcag tgagactggc 4620
ggcccgatgc ccagtgttca ccctgctggc ggcagtcagg aacatgttca caaagcttta 4680 
cttcaagtgg tctagaggtg atctgaggtg gagtaacagg tccagatagg ctacgttcat 4740 
aaaacagctt cagcggggtt taggaacact gtgcatttac gggacgcagt gggtcagagt 4800 
gctgctgtcc gtgggaggtg gccccagggc aggtcagtgg gcacgtcctg tggtaagtgg 4860
gactgtggat gtgggctcag gctggactca gcagccctgc tggataccaa ggcctgcaag 4920 
ggctggcccc ctggtgaatt gtcccgtgcc ctgtgtatct atgagtcctg cagagatgac 4980
aaatcagggg acggggtcat gtctagtcac cgtctgggaa aatgctccag gagtgaacac 5040 
atttcaggct cttgatggat gtacctccaa actcttctct ggatgggtgg gccagcttgc 5100 
atgcctgtgc cggcctctgc ccagcgaggt cagggccagg ccacacagtc agtctgactt 5160 
tggcagaagt tgagaggcaa cacttgtctc ttgtttcagc ttgcctttct ttgtgtactt 5220 
ctgagagcga gcattctttt catgttctat ccgctggccg ttcttctgcg gaatgtctgt 5280 
tcacgtcctt tgcagtctgt taatgaggtt tccaaccttc cctcattttt gtaatctgta 5340 
agaacttttt ccagactagc gatataaatc cttgtcaaat attgcaaaca cttttctcat 5400 
ttcatctggt tttaatctat cctggttttt aaaaaatgtg tctgtggaag tttaattttt 5460
atgtagtcac atctcagttt ttttccattg catttattct cagaatgctt ctccctgccc 5520 
tgagattaga taagcagtca tttgttcttt cttgagttat tttgagattt cagttttaac 5580 
attttcttct ataatccatg tggctgggtt ttgggatctg gctaaccccc gccatgccag 5640 
tagcctgagg ggcccagccc cacttgttga acagccgctc tccccgcccc acccaccctg 5700 
cctgcctgcc cacccgccct ggtctctcca ggaatcatgt tcgttcagga ggaggccctg 5760
gccagcagcc tctcgtccac tgacagtctg actcccgagc accagcccat tgcccaggga 5820
tgttctgatt ccttggagtc catccctgcg ggacaggtaa tgccctcttc ccgcttctgg 5880
ggaccataca tctgtgggtg gactcttctg cttggggttg tgtgcagtag gaagtggcct 5940
agctggagct gaggcagatg cttccagggt ttggcgtcct ctgctttgcg ccacggtctt 6000
tctcttggac ctgtctctgg ttgagtgtct tcctgacaaa cacagtggtt aagggtttat 6060
tttcagcctc cctccttccc ttccccaccc accttggttg atgggaacag gcagttctct 6120
gtcactgggc ccagggcacg aggggggcag gtggagaggg tggcccttga ccctgtgagc 6180
aggcttccct ggggaaggca tttcaaaaga ccctcgtgca ggggcttgtt tgggtttctt 6240
ctctgtttcc tggcacccct ggagccactc ggcgcctttc cgcatgtcac cctggtggtc 6300
tgggaaacag tctcactctg gcgcctcctc tgtggttgtt actgagagtt ctggggcccc 6360
ttcctttgtc ctgaggaaag acaggaggaa agcaagggtg cttgctgtgt gcttcgcaaa 6420
tgtgcttggt gcctgggcct ccctccagcc ccatctctgc agcagcacaa ggttatggcc 6480
ttgtgacact gggacagttt gcagagtcct tgtctgtcct cagtactcca cagtattctg 6540
ccatcaccct ttccagggtc acacagcaag agattcccaa gccctaggta ttccccagtg 6600
cacagagacc attgggaggg acttgccagg gctgtgtcca ctgctggcca gttagggtcg 6660
gaccaaattt gtagactgtc tacctggacc cttgcgtggc acaaggagca gtcagatgct 6720
ggatccctgg agagtggcga gaggctctgg ccttaggttg cgagtgggaa tcccagccct 6780
gctgtgtgct ggtgggataa ccaagtgggt ctctgccctt gggtcccaga gtgggcccca 6840
gggtcccaga gtgggctcca gggtacagcg tggggatggg gagcctcctc agggcggtga 6900
tggagggcag aatgcccagc tcagggtctg gcaaccagta aatggctggg gctggctgca 6960
gtaggtgggg actgactgtg tttctttctc catcaggcag cttccgatga tttaagggac 7020
gtgccaggag ctgttggtgg tgcaaggtaa ggaagaggtt ggaaagggac ctgggcctgg 7080
ccacacagcc ttatgcacac acactgctgt gggccagggg tggccagtca ggttttttta 7140
aaaatccgtt cacagaaggc ctatagaact atttcttcct ctaaagagac acagatgaga 7200
tggacttttc aatctgtttc caaattctaa tacctaaact ctgctcagca catgttgccc 7260
tacaccaggg gttggcaaat caaggcctgt gtgtggccca cagcctggga gctaagaatg 7320
acagttacat tcttttttct ttttttgaga ctgagtctcg ctctgtcgcc caggctggag 7380
tgcagtggcg tgttcttggc tcactgcaac ccccgcctcc cagattaatg caattttcct 7440
gtctcagcct cagccttctg agtagcccgg accacaggcg cacgccacca cgcccaacta 7500
attttttata tttttagtag agacagagat tcaccatgtg gcctagctgg tctcgaactc 7560
ctgaactcca gtgatccacc aacctcggct tcctaaagta ctggaattac aggcatgagc 7620
caccgcgcct ggctagaata acagttactt tttttttctt tgagactgag tcttgctttg 7680
tcacccaggc tggagtgcag tggcacgatc tcagctcgct gcaacctccg cctcccgggt 7740
tcaagcgatt cttctgcctc agccacccaa ggtgcccgcc accacacctg gctaattttt 7800
ctgtttttag tagggacagg atttcgccat gttggacagt tacattctta aagggctgct 7860
gaagatcgta tggacatggt agcccataaa tcccaaaatg tgtactctga ccctttacag 7920
aagcttacta actcccactc tacatgtgag ggctgcggtg gccaagaaga gctggaattt 7980
aagtgtgaag gtcctaagac ctgccccagc ccacttccct gccccggagg ccaccagggg 8040
tgacaagtag attcatgccc tggagtgttc cttctctccg gggcttatgg cagcaactga 8100
atgacttaga agtccatggg agtgctttct gttgtgggaa ctcgtgtggt ctgggcatag 8160
ctgtgccagg cacctatggt ccaagcccct agaagcatag actctgacca aactggcgac 8220
ccagccttcc agcaggcagc actggctccc accagggccc tcatcctggg aactgacttg 8280
gccatgtggg aggcttggga gacccatggg ttggtttctc agggtcaggg tgtagcagtg 8340
ggctccagat gtggcaggtg ggaggtggga ggggcccctc ccagcatgcc actgacctgg 8400
cctctccctg cacagcccag aacatgccga gccggaggtc caggtggtgc cggggtctgg 8460
ccagatcatc ttcctgccct tcacctgcat tggctacacg gccaccaatc aggacttcat 8520
ccagcgcctg agcacactga tccggcaggc catcgagcgg cagctgcctg cctggatcga 8580
ggctgccaac cagcgggagg agggccaggg tgaacagggc gaggaggagg atgaggagga 8640
ggaagaagag gaggacgtgg ctgagaaccg ctactttgaa atggggcccc cagacgtgga 8700
ggaggaggag ggaggaggcc agggggagga agaggaggag gaagaggagg atgaagaggc 8760
cgaggaggag cgcctggctc tggaatgggc cctgggcgcg gacgaggact tcctgctgga 8820
gcacatccgc atcctcaagg tgctgtggtg cttcctgatc catgtgcagg gcagtatccg 8880
ccagttcgcc gcctgccttg tgctcaccga cttcggcatc gcagtcttcg agatcccgca 8940
ccaggagtct cggggcagca gccagcacat cctctcctcc ctgcgctttg tcttttgctt 9000
cccgcatggc gacctcaccg agtttggctt cctcatgccg gagctgtgtc tggtgctcaa 9060
ggtacggcac agtgagaaca cgctcttcat tatctcggac gccgccaacc tgcacgagtt 9120
ccacgcggac ctgcgctcat gctttgcacc ccagcacatg gccatgctgt gtagccccat 9180
cctctacggc agccacacca gcctgcagga gttcctgcgc cagctgctca ccttctacaa 9240
ggtggctggc ggctgccagg agcgcagcca gggctgcttc cccgtctacc tggtctacag 9300
tgacaagcgc atggtgcaga cggccgccgg ggactactca ggcaacatcg agtgggccag 9360
ctgcacactc tgttcagccg tgcggcgctc ctgctgcgcg ccctctgagg ccgtcaagtc 9420
cgccgccatc ccctactggc tgttgctcac gccccagcac ctcaacgtca tcaaggccga 9480
cttcaacccc atgcccaacc gtggcaccca caactgtcgc aaccgcaaca gcttcaagct 9540
cagccgtgtg ccgctctcca ccgtgctgct ggaccccaca cgcagctgta cccagcctcg 9600
gggcgccttt gctgatggcc acgtgctaga gctgctcgtg gggtaccgct ttgtcactgc 9660
catcttcgtg ctgccccacg agaagttcca cttcctgcgc gtctacaacc agctgcgggc 9720
ctcgctgcag gacctgaaga ctgtggtcat cgccaagacc cccgggacgg gaggcagccc 9780
ccagggctcc tttgcggatg gccagcctgc cgagcgcagg gccaggtgag atcaagcaca 9840
gctctcaggg gccccggggg cacgggtctg gcatgtgtgt gatctcagca tctgcggcta 9900
gtgtgggctg ggagttgctg cgagagctgg gccccctccc ccctgcccct cgcccccccc 9960
gggcctccct ctacatcacc accccaggtt tggtgccagg ctgctcctta tctcagtgct 10020
gtagaagaag cccaggaaag ctgtcctctc acaaaatggg ttggcccagc ctcttgccac 10080
ccatgaaggg caggccaagg gggctgcccc acctttgcct gcccagtggg agagcaacag 10140
gctgcagcac accgaggcca ggagagctgt caccctggct gctgtgctcc tctgggccca 10200
agcatggcct ctgggcacta cctcctccag ggtcacagtc ccacggatgg ctctgtgggc 10260
caggatctgc cttaggcttc acccacctca acatcttgct gtgttgttca ggctggtctc 10320
aaactttggg ctcaaacaat cctccgcctc agcctcccaa agtgctggga ttacagacat 10380
gagccaccgt gcccggccgt gctgttctgt tctccaatag agaagctggt ggaagtcccc 10440
agtaacccag aggtgatgtg tgatgcacac agtctcctca ctctgaagct gcacatgcga 10500
tgtgaatctt catttggggt ccgctgttaa tatggtgttt ttcgggggat acagcaatga 10560
ccagcgtccc caggaggtcc cagcagaggc tctggccccg gccccagtgg aagtcccagc 10620
tccagcccct gcagcagcct cagcctcagg cccagcgaag actccggccc cagcagaggc 10680
ctcaacttca gctttggtcc cagaggagac gccagtggaa gctccagccc cacccccagc 10740
cgaggcccct gcccagtacc cgagtgagca cctcatccag gccacctcgg aggagaatca 10800
gatcccctcg cacttgcctg cctgcccgtc gctccggcac gtcgccagcc tgcggggcag 10860
cgccatcatc gagctcttcc acagcagcat tgctgaggta gcggcccggg tgtgggtgcc 10920
agctatggca cggccagtcc tgagggcgag gccaagcttg gcttcaggtc agcctcaggt 10980
ccctggactt ccctgatgtc ggagtcctca gctgagctgc tcacagcttt gaggacctgg 11040
gcagtgaggt cctgagttgc cctccctggc catttgtgct gtgtcaccac ctcctgtgcc 11100
acttccagcc ccaggtagac ctcccaccaa cagccatctc ccacccctct cttcctctct 11160
gccttgaagc atacggattc attggtgagc caagaggggc ttcccatgtc tccttgtgga 11220
agctgtgggc atgtccctgg tatgtgcagg ttgctagggt ggtggagctg acaggaggcc 11280
ccccgtcttc aggttgaaaa cgaggagctg aggcacctca tgtggtcctc ggtggtgttc 11340
taccagaccc cagggctgga ggtgactgcc tgcgtgctgc tctccaccaa ggctgtgtac 11400
tttgtgctcc acgacggcct ccgccgctac ttctcagagc cactgcaggg taggcacagg 11460
gcctgctggg gctcaggagc ttggagtgtg tggttggggc aggcctgggg ggtcattctc 11520
tggagccagc tgtgtggctt caggcagcag tcagcgactt ggctgcagtg ggctgagagt 11580
tccttgtctg aggaagggag ctgtcatgag ggaggggtcc atggccagat gtgaacgcag 11640
aatgcactga gccagggcct ggtgactgct tgggaacagc ctgtgatgag aaggggttag 11700
gcagcctttg cccctggggc tgcacaggaa gccctagcca gcgacctggt gactcccctg 11760
agctggaaga ggctcagact ccagagggca ttgcctatgg ggctttgcac gggtggaagc 11820
caggccagcc aagaggacct gttcctgctg gatgtgctgc acacctagga accttgtgct 11880
tgcctgccac cgcctccctc tgtccctttc tccatcacac agatttctgg catcagaaaa 11940
acaccgacta caacaacagc cctttccaca tctcccagtg cttcgtgcta aagcttagtg 12000
acctgcagtc agtcaatgtg gggcttttcg accagcattt ccggctgacg cgtgggtgac 12060
cctctgtgct ttgtcctatt tcgggtgaag gccagcatca ccagtgggct tccaccttcc 12120
gtacgtgggt gggttatcat agacagttat ctctgtgctc aagagccact tcttacccgg 12180
ggtgggagga agcagcttca ggaactgctg agagagcaga actcacgctc cagggctcag 12240
agcaggaggt agggtgtgcg gcaagcgctg gcccggacag aagcagagtg ggccctggtc 12300
tcgggcagga tgtttctgac tcacatttcc tgaggagaga aagctaagct ctttgcctaa 12360
tgtctctgtc tccccttcca gaaaaatgcc tcagctcttc cggcctgaag gaatggcctc 12420
ctcccgggcc ccatgattct ttcctgtgtg ggccctcctg gccctggcct ctgggctgag 12480
gcttgctagg gactcggggt ggctctaagg ggcagggata gggctgggga gcgccggcct 12540
gtggccctga ccagcccctt ctcgtgcagg ttccaccccg atgcaggtgg tcacgtgctt 12600
gacgcgggac agctacctga cgcactgctt cctccagcac ctcatggtcg tgctgtcctc 12660
tctggaacgc acgccctcgc cggagcctgt tgacaaggac ttctactccg agtttgggaa 12720
caagaccaca ggtacccctg tctagctcag gctgcagaca ggctgcctgg acagacgtca 12780
tgggccccag ggtggctctc tgtgccccag aaccctctct gcctctatgt ctctcttttc 12840
tcacttagct ggccagggtt ttatgtgggg cttttcgatg gcagagtctc cactccagca 12900
gtccctcaac catctggcag acacatctcc agtgcctgct ttgggctcct ggcctgtggg 12960
ccccacactt ggagcatcct ctcctgcctg tctcatgccg gggtctctcg gttggcttgg 13020
ggcccttggt gctcccagcc ccaccagggg ccggttccag gctatagccc aggtggcatc 13080
tctctgcagg gaagatggag aactacgagc tgatccactc tagtcgcgtc aagtttacct 13140
accccagtga ggaggagatt ggggacctga cgttcactgt ggcccaaaag atggctgagc 13200



cagagaaggc cccagccctc agcatcctgc tgtacgtgca ggccttccag gtgggcatgc 13260
caccccctgg gtgctgcagg ggccccctgc gccccaagac actcctgctc accagctccg 13320
agatcttcct cctggatgag gactgtgtcc actacccact gcccgagttt gccaaagagc 13380
cgccgcagag agacaggtac cggctggacg atggccgccg cgtccgggac ctggaccgag 13440
tgctcatggg ctaccagacc tacccgcagg ccctcaccct cgtcttcgat gacgtgcaag 13500
gtcatgacct catgggcagt gtcaccctgg accactttgg ggaggtgcca ggtggcccgg 13560
ctagagccag ccagggccgt gaagtccagt ggcaggtgtt tgtccccagt gctgagagca 13620
gagagaagct catctcgctg ttggctcgcc agtgggaggc cctgtgtggc cgtgagctgc 13680
ctgtcgagct caccggctag cccaggccac agccagcctg tcgtgtccag cctgacgcct 13740
actggggcag ggcagcaggc ttttgtgttc tctaaaaatg ttttatcctc cctttggtac 13800
cttaatttga ctgtcctcgc agagaatgtg aacatgtgtg tgtgttgtgt taattctttc 13860
tcatgttggg agtgagaatg ccgggcccct cagggctgtc ggtgtgctgt cagcctccca 13920
caggtggtac agccgtgcac accagtgtcg tgtctgctgt tgtgggaccg ttgttaacac 13980
gtgacactgt gggtctgact ttctcttcta cacgtccttt cctgaagtgt cgagtccagt 14040
cctttgttgc tgttgctgtt gctgttgctg ttgctgttgg catcttgctg ctaatcctga 14100
ggctggtagc agaatgcaca ttggaagctc ccaccccata ttgttcttca aagtggaggt 14160
ctcccctgat ccagacaagt gggagagccc gtgggggcag gggacctgga gctgccagca 14220
ccaagcgtga ttcctgctgc ctgtattctc tattccaata aagcagagtt tgacaccgtc 14280
tgcatcttct aaaccaaggg tcactgggat cgagtcgacg gccctatagt gagtcgtatt 14340
agagctcgcg gccgcgagct ctagatgcat gctcgagcgg ccgccagtgt gatggatatc 14400
tgcagaattc cagcacactg gcggccgtta ctagtggatc cgagctccac agaggtggtc 14460


gatgagggct 


gccctttccc 


acatccttag 


tagggggttc 


aagatgaccc 


agactgtgcc 


14520 


cctggggagc 


ttggagccat 


gcgggaggat 


gagccatgtg 


ctggaggaga 


acagggtagg 


14580 


atggtgtggg 


gcttttgtag 


actgtctaga 


agcaaagaag 


gtctgcagtg 


gaggtggtgt 


14640 


ctgaggtgaa 


tctcgaaggt 


gaataggagt 


tgaacgttag 


caggcagagg 


gtggattgca 


14700 


ggagagcagc 


ggcctgggca 


ggtgcccagc 


gtggcccatc 


agggtgcttc 


atgcatggct 


14760 


gtgtgcttgc 


catccttcct 


gcctgcctac 


cccctgctgc 


ttcgcttcat 


gggggcgttt 


14820 


gagcttgggc 


ccacctgcct 


gcctcgcttg 


tgggcagagg 


acccaagctg 


tgtgagttgt 


14880 


cctgtcccgg 


ggagcagctg 


aactggtccg 


gggtctcgaa 


ctgtggggct 


caaaaggact 


14940 


ccggggtcat 


ttcactgggg 


ctgtgccgat 


tcctgggggc 


tgttnggaan 


gtaaaggcct 


15000 


aaaggggctc 


cctggttang 


gccctcaant 


ttaanaacct 


ggggccgggg 


cccggaattg 


15060 
cccccaantt tgtttcaacn ccccttggcc ttnggcnggg gcaaatttcc anggggaacc 15120
aatggntttc ccccaaaaan ggggccnttt taacccnttt ccaaantttg ggncctaaaa 15180 
aagggtggan ttcctgaang gg 15202 



<210> 22 
<211> 1070 
<212> PRT 

<213> Homo sapiens 



<400> 22 

Val Cys Leu Asp Asp Thr Val Thr Thr Glu Lys Glu Leu Asp Thr Val 
1 5 10 15

Glu Val Leu Lys Ala Ile Gln Lys Ala Lys Glu Val Lys Ser Lys Leu
20 25 30

Ser Asn Pro Glu Lys Lys Gly Gly Glu Asp Ser Arg Leu Ser Ala Ala
35 40 45

Pro Cys Ile Arg Pro Ser Ser Ser Pro Pro Thr Val Ala Pro Ala Ser
50 55 60

Ala Ser Leu Pro Gln Pro Ile Leu Ser Asn Gln Gly Ile Met Phe Val
65 70 75 80

Gln Glu Glu Ala Leu Ala Ser Ser Leu Ser Ser Thr Asp Ser Leu Thr
85 90 95

Pro Glu His Gln Pro Ile Ala Gln Gly Cys Ser Asp Ser Leu Glu Ser
100 105 110

Ile Pro Ala Gly Gln Ala Ala Ser Asp Asp Leu Arg Asp Val Pro Gly
115 120 125

Ala Val Gly Gly Ala Ser Pro Glu His Ala Glu Pro Glu Val Gln Val
130 135 140

Val Pro Gly Ser Gly Gln Ile Ile Phe Leu Pro Phe Thr Cys Ile Gly
145 150 155 160

Tyr Thr Ala Thr Asn Gln Asp Phe Ile Gln Arg Leu Ser Thr Leu Ile
165 170 175

Arg Gln Ala Ile Glu Arg Gln Leu Pro Ala Trp Ile Glu Ala Ala Asn
180 185 190

Gln Arg Glu Glu Gly Gln Gly Glu Gln Gly Glu Glu Glu Asp Glu Glu
195 200 205

Glu Glu Glu Glu Glu Asp Val Ala Glu Asn Arg Tyr Phe Glu Met Gly
210 215 220

Pro Pro Asp Val Glu Glu Glu Glu Gly Gly Gly Gln Gly Glu Glu Glu
225 230 235 240

Glu Glu Glu Glu Glu Asp Glu Glu Ala Glu Glu Glu Arg Leu Ala Leu
245 250 255 



Glu Trp Ala Leu Gly Ala Asp Glu Asp Phe Leu Leu Glu His Ile Arg
260 265 270

Ile Leu Lys Val Leu Trp Cys Phe Leu Ile His Val Gln Gly Ser Ile
275 280 285

Arg Gln Phe Ala Ala Cys Leu Val Leu Thr Asp Phe Gly Ile Ala Val
290 295 300

Phe Glu Ile Pro His Gln Glu Ser Arg Gly Ser Ser Gln His Ile Leu
305 310 315 320

Ser Ser Leu Arg Phe Val Phe Cys Phe Pro His Gly Asp Leu Thr Glu
325 330 335

Phe Gly Phe Leu Met Pro Glu Leu Cys Leu Val Leu Lys Val Arg His
340 345 350

Ser Glu Asn Thr Leu Phe Ile Ile Ser Asp Ala Ala Asn Leu His Glu
355 360 365

Phe His Ala Asp Leu Arg Ser Cys Phe Ala Pro Gln His Met Ala Met
370 375 380

Leu Cys Ser Pro Ile Leu Tyr Gly Ser His Thr Ser Leu Gln Glu Phe
385 390 395 400

Leu Arg Gln Leu Leu Thr Phe Tyr Lys Val Ala Gly Gly Cys Gln Glu
405 410 415

Arg Ser Gln Gly Cys Phe Pro Val Tyr Leu Val Tyr Ser Asp Lys Arg
420 425 430

Met Val Gln Thr Ala Ala Gly Asp Tyr Ser Gly Asn Ile Glu Trp Ala
435 440 445

Ser Cys Thr Leu Cys Ser Ala Val Arg Arg Ser Cys Cys Ala Pro Ser
450 455 460

Glu Ala Val Lys Ser Ala Ala Ile Pro Tyr Trp Leu Leu Leu Thr Pro
465 470 475 480

Gln His Leu Asn Val Ile Lys Ala Asp Phe Asn Pro Met Pro Asn Arg
485 490 495

Gly Thr His Asn Cys Arg Asn Arg Asn Ser Phe Lys Leu Ser Arg Val
500 505 510

Pro Leu Ser Thr Val Leu Leu Asp Pro Thr Arg Ser Cys Thr Gln Pro
515 520 525

Arg Gly Ala Phe Ala Asp Gly His Val Leu Glu Leu Leu Val Gly Tyr
530 535 540

Arg Phe Val Thr Ala Ile Phe Val Leu Pro His Glu Lys Phe His Phe
545 550 555 560

Leu Arg Val Tyr Asn Gln Leu Arg Ala Ser Leu Gln Asp Leu Lys Thr
565 570 575

Val Val Ile Ala Lys Thr Pro Gly Thr Gly Gly Ser Pro Gln Gly Ser
580 585 590



Phe Ala Asp Gly Gln Pro Ala Glu Arg Arg Ala Ser Asn Asp Gln Arg
595 600 605

Pro Gln Glu Val Pro Ala Glu Ala Leu Ala Pro Ala Pro Val Glu Val
610 615 620

Pro Ala Pro Ala Pro Ala Ala Ala Ser Ala Ser Gly Pro Ala Lys Thr
625 630 635 640

Pro Ala Pro Ala Glu Ala Ser Thr Ser Ala Leu Val Pro Glu Glu Thr
645 650 655

Pro Val Glu Ala Pro Ala Pro Pro Pro Ala Glu Ala Pro Ala Gln Tyr
660 665 670

Pro Ser Glu His Leu Ile Gln Ala Thr Ser Glu Glu Asn Gln Ile Pro
675 680 685

Ser His Leu Pro Ala Cys Pro Ser Leu Arg His Val Ala Ser Leu Arg
690 695 700

Gly Ser Ala Ile Ile Glu Leu Phe His Ser Ser Ile Ala Glu Val Glu
705 710 715 720

Asn Glu Glu Leu Arg His Leu Met Trp Ser Ser Val Val Phe Tyr Gln
725 730 735

Thr Pro Gly Leu Glu Val Thr Ala Cys Val Leu Leu Ser Thr Lys Ala
740 745 750

Val Tyr Phe Val Leu His Asp Gly Leu Arg Arg Tyr Phe Ser Glu Pro
755 760 765

Leu Gln Asp Phe Trp His Gln Lys Asn Thr Asp Tyr Asn Asn Ser Pro
770 775 780

Phe His Ile Ser Gln Cys Phe Val Leu Lys Leu Ser Asp Leu Gln Ser
785 790 795 800

Val Asn Val Gly Leu Phe Asp Gln His Phe Arg Leu Thr Gly Ser Thr
805 810 815

Pro Met Gln Val Val Thr Cys Leu Thr Arg Asp Ser Tyr Leu Thr His
820 825 830

Cys Phe Leu Gln His Leu Met Val Val Leu Ser Ser Leu Glu Arg Thr
835 840 845

Pro Ser Pro Glu Pro Val Asp Lys Asp Phe Tyr Ser Glu Phe Gly Asn
850 855 860

Lys Thr Thr Gly Lys Met Glu Asn Tyr Glu Leu Ile His Ser Ser Arg
865 870 875 880

Val Lys Phe Thr Tyr Pro Ser Glu Glu Glu Ile Gly Asp Leu Thr Phe
885 890 895

Thr Val Ala Gln Lys Met Ala Glu Pro Glu Lys Ala Pro Ala Leu Ser
900 905 910

Ile Leu Leu Tyr Val Gln Ala Phe Gln Val Gly Met Pro Pro Pro Gly
915 920 925



Cys Cys Arg Gly Pro Leu Arg Pro Lys Thr Leu Leu Leu Thr Ser Ser 
930 935 940 

Glu Ile Phe Leu Leu Asp Glu Asp Cys Val His Tyr Pro Leu Pro Glu
945 950 955 960

Phe Ala Lys Glu Pro Pro Gln Arg Asp Arg Tyr Arg Leu Asp Asp Gly
965 970 975

Arg Arg Val Arg Asp Leu Asp Arg Val Leu Met Gly Tyr Gln Thr Tyr
980 985 990

Pro Gln Ala Leu Thr Leu Val Phe Asp Asp Val Gln Gly His Asp Leu
995 1000 1005

Met Gly Ser Val Thr Leu Asp His Phe Gly Glu Val Pro Gly Gly Pro
1010 1015 1020

Ala Arg Ala Ser Gln Gly Arg Glu Val Gln Trp Gln Val Phe Val Pro
1025 1030 1035 1040

Ser Ala Glu Ser Arg Glu Lys Leu Ile Ser Leu Leu Ala Arg Gln Trp
1045 1050 1055 

Glu Ala Leu Cys Gly Arg Glu Leu Pro Val Glu Leu Thr Gly 
1060 1065 1070 